How to get the Broadcom network drivers working on an HP BL460c Gen8 with CentOS 8

When CentOS 8 came out I was all excited. It shipped with a newer kernel, a version of PHP that was not EOL, and a lot of other goodies. My first few installs on “regular” servers went on without a hitch. I then tried to install it on some Generation 8 HP BL460c blades and everything worked fine, everything that is except networking. With the help of some people in the CentOS group on Facebook I got it sorted out; here is what I did.

Since the blades are sitting in a data center, the only way for me to set them up remotely is via the iLO. I put the CentOS 8 DVD up on an NGINX host and was easily able to install the operating system. As mentioned above, all was fine except there was no networking. The suggestion was made that I run lspci -nnk, which gave me this:

As you can see, the ID of the device is 103c:337b. If we look at the removed adapters list from Red Hat, we can see that support for our driver was removed. There is no reason to fret: elrepo has a driver for it. If we look at the supported device IDs for elrepo, we can clearly see 19A2:0710 under be2net.ko. Next I went hunting in the elrepo el8 repository for be2net specifically and found http://mirror.rackspace.com/elrepo/elrepo/el8/x86_64/RPMS/. But how do we get this RPM onto a server sitting 60 miles away, you ask? Very easily, with mkisofs. Below are the exact steps that I ran:

On the NGINX host:

  1. mkdir /tmp/drivers
  2. cd /tmp/drivers
  3. wget http://mirror.rackspace.com/elrepo/elrepo/el8/x86_64/RPMS/kmod-be2net-12.0.0.0-2.el8_0.elrepo.x86_64.rpm
  4. mkisofs -o drivers.iso /tmp/drivers/
  5. mv drivers.iso /usr/share/nginx/html/

I then connected to the iLO and mounted the ISO that was on my NGINX server. On the blade I then did the following:

  1. mkdir /mnt/cd # create a directory for the iso that we created
  2. mount /dev/sr0 /mnt/cd
  3. cd /mnt/cd
  4. rpm -i kmod-be2net-12.0.0.0-2.el8_0.elrepo.x86_64.rpm
  5. modprobe be2net
  6. ip a s # This now showed both network cards.

I did a reboot and sure enough on reboot my network cards were there.
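If you need to do the same on different hardware, the vendor:device pairs are the bracketed values in the lspci -nn output. A quick sketch for pulling the device ID out of a line (the sample line below is in lspci’s format but is illustrative, not the exact output from my blades):

```shell
# Extract the last bracketed [vendor:device] PCI ID pair from an lspci -nn
# line. The line below is a made-up example, not real output.
line='04:00.0 Ethernet controller [0200]: Emulex BladeEngine 2 [19a2:0710]'
echo "$line" | grep -oE '\[[0-9a-f]{4}:[0-9a-f]{4}\]' | tr -d '[]' | tail -n 1
```

The `tail -n 1` is there because the bracketed class code (e.g. [0200]) matches the same pattern; the device ID is the last pair on the line.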

No, it wasn’t the network (well, sort of).

When it comes to problem solving, people tend to go in the direction that is most familiar to them. If a client is having an issue, the developer will look in their code and the network engineer may look at port statistics on their switch. As one who wears the engineer hat most of the time, I tend to take the OSI model route.

Over the last few days a good portion of my time has been taken up by a VIP client that was having what seemed to be random call quality issues. The customer has multiple offices around Europe and is using AWS to host their PBX. We use VoIPMonitor to trace all of our calls. In addition, I built tooling that runs mtr traces once a minute to all IPs, so that should there be a network issue we have a record of it. As I have said many times in the past, I don’t want to have to reproduce an issue; I want to be able to go back in time and see what the problem was so we can fix it right away. As soon as the complaints came in we went straight to VoIPMonitor and the traces. All the call captures showed packet loss and jitter. The mtr traces did show some jitter, but it did not match what we were seeing in the call captures. The call captures were painting a much grimmer (and, as it turned out, more accurate) picture. We asked the usual questions and got the standard response: they watch their servers, everything was fine. They insisted the issue was with our ISP.

At this point I suggested that perhaps the issue was not between us (as I suspected) but between their AWS instance and their other offices. They didn’t think so but were willing to entertain the idea. I asked the client to set up traces from their AWS site to their offices so we could see where the issue was, and they assured me they would. Sure enough, the next day there were issues. When I asked for the traces they said they didn’t have them, as the traffic was tunneled between sites so a trace would not help. I explained that even if the link was point-to-point, an internet issue would still show up in the form of jitter and/or packet loss to the remote device. I went so far as to launch an AWS instance in the same region as the client with an IP in the same /15. I then set up scripts to run traces to all of their IPs to see if I could replicate the issue.
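The traces themselves are nothing fancy: a 60-cycle mtr report per target, run from cron once a minute into timestamped files (the IP, path, and hop line below are examples, not the client’s). Pulling the loss column off a hop line later is a one-liner:

```shell
# One round of traces, meant to be run from cron every minute, e.g.:
#   mtr -rwc 60 203.0.113.10 > /var/log/mtr/203.0.113.10-$(date +%Y%m%d-%H%M).log
# Each hop in the report is one summary line; field 3 is the loss percentage.
hop=' 9.|-- 203.0.113.10    0.0%    60   71.2  72.0  65.1  90.3   4.8'
echo "$hop" | awk '{print $3}'
```

Keeping every report on disk is what lets you go back in time instead of trying to reproduce the problem.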

Fast forward to this morning, and the issues started again. I looked at the traces from my AWS instance and the traffic was clean. We did a screen share to the client’s server, where I ran several traces to the endpoints. I had the following running:

  • MTR to the remote’s public IP
  • MTR to the remote’s internal IP (where the traffic was going over IPsec)
  • MTR to our site (for comparison)
  • Ping to the remote’s external IP
  • Ping to the remote’s internal IP (where the traffic was going over IPsec)

On my AWS instance I was running both a ping and an mtr trace.

Right away both the mtr and the pings were showing jitter. There was on average a 20ms spread (ping times were anywhere from 65ms to 90ms). MTR made things look a bit worse than they were, since it would show a max ping time of up to 300ms. As a rule, unless the ping times are all over the place or are consistently high, if one in a few thousand packets has a delay I don’t pay much attention to it. We watched the screens for a while and there was clearly a networking issue. The pings and traces from my AWS box were coming back clean; there was almost no jitter and the ping times were for the most part consistent. At this point I was thinking that perhaps the issue was with the host instance, maybe it was overloaded, or perhaps there was another instance on the host box that was under attack. I started to poke around a bit more. As mentioned above, the client said the box was clean, so I hadn’t looked at the load right away. Now, out of other ideas, I ran top, and sure enough the load average was at 18. The box had 8 cores, which means there were more tasks waiting for the CPU than it was capable of handling. If you have a script doing compression, an extra moment of CPU contention won’t matter. With voice, on the other hand, it absolutely matters: if a packet is delayed by even 20ms it can hurt the call.
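Quantifying that kind of jitter from raw ping output is easy. A sketch using canned RTT values in ping’s `time=` format (the numbers are illustrative):

```shell
# Pull the RTTs out of ping-style output and report min/avg/max and spread.
printf 'time=65.2 ms\ntime=90.1 ms\ntime=71.4 ms\n' \
  | grep -oE 'time=[0-9.]+' | cut -d= -f2 \
  | awk '{s+=$1; if(min==""||$1<min)min=$1; if($1>max)max=$1}
         END{printf "min=%s avg=%.1f max=%s spread=%.1f\n", min, s/NR, max, max-min}'
```

A large spread between min and max is exactly the jitter that wrecks voice even when the average RTT looks fine.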

Next I checked the memory in use, whether any of it was in swap, and whether there was any wait on the disks, as these can all cause high load averages; they all came back clean. I then ran top with the delay parameter set to 10 seconds. The reason for this is that should there be a spike for only a small amount of time, I wanted enough of a window to see the processes causing it; by default top refreshes every 3 seconds. I then pressed “Shift + P” (sort by CPU) over and over until I saw which processes were causing the increase in CPU usage. There were a few scripts running that would randomly spike, with a total usage of nearly 800% of the CPU (which is all 8 cores on the box). This in turn pushed the load over the ideal limit of 8, which seems to have been causing the issue. As the scripts were not essential, we killed them. Within a few minutes the load average went down to about 2 and the complaints stopped.
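The sanity check I could have started with is simply comparing the 1-minute load average against the core count (a quick sketch, not part of any tooling we ship):

```shell
# Compare the 1-minute load average against the number of cores.
# A sustained load above the core count means tasks are queuing for the CPU.
cores=$(nproc)
load=$(cut -d' ' -f1 /proc/loadavg)
awk -v l="$load" -v c="$cores" 'BEGIN { print (l > c) ? "overloaded" : "ok" }'
```

On the client’s box this would have printed “overloaded” (load 18 against 8 cores) before we spent any time on the network.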

Whatever hat we are wearing, we can’t have tunnel vision. You need to always look at the whole picture and never assume. As an engineer I let the OSI model take over my thinking, and I didn’t start with the possibility that the box was overloaded, which was causing it to drop packets.

We need to get Serious about Security

They say the S in IoT stands for security. Let that sink in…

As developers/hackers/makers, or whatever we may call ourselves, we need to put ourselves in the shoes of the admin/end user (yes, I am lumping admins and end users together). The advancement of FreePBX, Asterisk, FreeSWITCH and other open source technologies has been a double-edged sword. On one hand it has lowered the barrier to entry for many folks and given a lot of us job opportunities that we would otherwise never have had. On the other hand it has made it a lot easier for the average IT person to simply download an ISO and have a fully working PBX in under an hour. While this is great, the people setting up such systems don’t always know much about security. They don’t realize that there are attackers just waiting for an easy system to compromise. We as creators and makers need to think for the person who does not know much and educate them. It starts with:

1) admin/admin should NEVER be the default for a device. In fact, devices should always ship with a unique password. California recently passed a law that requires IoT devices to have strong unique passwords. While I am personally against government telling companies how to operate, in this case I am very much for it. Stop being lazy!

2) Don’t allow the user to use simple passwords. If you’re creating a soft switch/PBX, don’t allow the user to use username 100 with password 100. At the very least, by default ship it with a config file that does not allow insecure passwords. The same goes for phones: if a user tries to put in an easy combination, at the very least warn them that it’s a bad idea.
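As an illustration (this helper is hypothetical, not part of any PBX), the check can be as simple as refusing a secret that matches the extension or is too short:

```shell
# Hypothetical provisioning check: reject a SIP secret that equals the
# extension number or is shorter than 12 characters.
check_secret() {
	local ext="$1" secret="$2"
	if [ "$secret" = "$ext" ] || [ "${#secret}" -lt 12 ]; then
		echo reject
	else
		echo ok
	fi
}
check_secret 100 100              # the classic 100/100 combo
check_secret 100 'x9#Lq2vTzR4w'   # a 12-character random secret
```

Even a crude length check like this would stop the username/password combos that the scanning bots try first.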

3) For IoT devices, disable the GUI by default. Many vulnerabilities are found in web interfaces, and the attackers know this. If the GUI is disabled by default, it’s one less way for the attackers to get in. Shout out to Panasonic for shipping their devices this way.

4) DON’T ALLOW PASSWORDS TO BE DOWNLOADED IN PLAIN TEXT! In the process of writing this blog post a client was compromised. The PBX appliance was shipped with a default username and password; the attacker simply logged in and downloaded all the credentials for the extensions on the system. Passwords should never be served in plain text, even over HTTPS!

5) If you are going to offer provisioning, at the very least force the end devices to use mutual TLS where they support it. If the device supports encrypting the configuration files, do that too. My take on security is a whack-a-mole approach: block as much as you can, wherever you can, without relying on any one method.

6) iptables and fail2ban are open source. Ship with them by default. It may be harder for users in the beginning, but they will be thankful overall.
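A minimal jail fragment along these lines, assuming fail2ban’s bundled asterisk filter and Asterisk’s default security log location (adjust paths for your distribution):

```ini
[asterisk]
enabled  = true
port     = 5060,5061
filter   = asterisk
logpath  = /var/log/asterisk/security
maxretry = 5
bantime  = 86400
```

Shipping something like this enabled out of the box costs the vendor nothing and takes the easy SIP brute-force attacks off the table.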

7) Have a bounty program. Give people an incentive to report vulnerabilities instead of selling them on the open market. You will retain customers long term; if you don’t, eventually they will go elsewhere. My customer that was compromised this morning is planning on replacing 15 of his current PBXs because of the issues he has had (even though part of it was an ID10T error).

That’s it for now. As new scenarios come up I will try to add to this list. For now, remember: the S in IoT stands for security.

/D

Why the FCC’s large fine on Robo Callers is worthless

Robo callers: we all hate them! Most of us have gotten those annoying phone calls at the most inconvenient of times. In the beginning it was hard for people to root out these calls; they would have to keep track of the caller IDs being used and learn to ignore them. Then came call blocking on the iPhone, along with apps such as TrueCaller. The spammers were…

Getting real time CPU stats while benchmarking

As of late I have been spending a lot of time working with a client to benchmark their software. We needed a way to get the cost per core on the system. We would start by limiting the software to one core, run tests and log results, then repeat with two cores, and so on. The stats were all over the place. We were not able to explain why certain CPUs behaved one way and others a different way. We tried looking at the chip generation, the type of RAM, etc., and found nothing conclusive. After digging around a bit I learned from those wiser than myself that the CPU clock speed can vary based on many factors, such as the TDP of the chip, how well the chassis handles heat dissipation, and so on. The most accurate way to get the CPU stats is by looking at each CPU’s current frequency. For instance, if we wanted to look at cpu0 we would do:

root@raspberrypi:~# cat /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
1200000
root@raspberrypi:~#

If we have 4 CPUs on the box we may want to do this:

root@raspberrypi:~# cat /sys/devices/system/cpu/cpu[0-3]/cpufreq/cpuinfo_cur_freq
600000
600000
600000
600000
root@raspberrypi:~#

Running these commands manually over and over gets tedious, so if you have a quick eye and want to watch things in real time you can do:

watch -n 0 cat /sys/devices/system/cpu/cpu[0-3]/cpufreq/cpuinfo_cur_freq

In order to make things easier for myself I wrote a quick bash script that checks each CPU every 500ms and keeps stats. This makes it easy to leave it running in the background to get an overall average of how each core is performing over a certain period.

#!/bin/bash
# Sample each core's current frequency every 500ms and print the
# per-core average (in kHz) when interrupted.
PROC_COUNT=$(nproc --all)
echo "Running as PID $$. To get the stats simply press Control+C"

sigint()
	{
	echo ""
	for ((j=0; j<PROC_COUNT; j++)); do
		AVG=$((CPU_S[j] / CPU_C[j]))
		echo "Core $j had an average frequency of $AVG kHz"
	done
	exit 0
	}

trap 'sigint' INT
trap ':' HUP # ignore a hangup so the script keeps running in the background

while :
do
	for ((i=0; i<PROC_COUNT; i++)); do
		CPU_C[i]=$((CPU_C[i] + 1))
		MY_CPU_T=$(cat "/sys/devices/system/cpu/cpu$i/cpufreq/cpuinfo_cur_freq")
		CPU_S[i]=$((CPU_S[i] + MY_CPU_T))
	done
	sleep 0.5 # 500ms between samples
done

How to get back and annoy IRS and other scammers

We have all received calls from the IRS scammers and the like, annoying us to no end and trying to scam us out of our hard-earned money. Until now there were two ways to get back at them.

  1. Keep them on the phone, letting them think they have a fish on the line, and waste their time.
  2. Do an LRN lookup on the number, report it, and check back to see if the line was taken down.

Both options are gratifying to an extent; however, the first one only ties up one scammer and the second takes time, sometimes lots of time. Now there is a better way: we can jam up their lines and annoy them. The added bonus over the other options is that no one can call them back and get suckered in. All you need is some basic Asterisk skills and a carrier that is willing to take your calls.

This is done in 5 easy steps.


Why you should invest in infrastructure

I got yet another phone call today from “card services”. What made this call special was the quality of the recording. After I picked up and said hello twice, the audio came in all broken up. Once I figured out what I needed to press to speak to a representative, the call was fine. This indicates that the box that made the call has no issue bridging packets, but it has a serious I/O problem, which is why the sound file being played to me was all broken up. If you are going to annoy people, at least do it right. You can hear the call here: https://instaud.io/1df6

Are you asking to get hacked?

Every so often when I do a password reset, the site will send me my password in plain text. This tells you that they are storing the password in plain text in their database. They are practically asking to be hacked. IMHO, any site that does not use some sort of salted hash for their passwords probably does not have the best security.

Why tracking how customers come to your site is important

Back in the days of old, a domain name was very important. Search engines weren’t what they are today. If you were a cheap toy store you wanted to get cheaptoys.com or something similar. People would search for terms and search engines would show the URLs that their systems thought were the closest fit to what you were looking for. That has all changed. While the domain name you choose is important, it’s not nearly as important as how you market your site and get people to it (aka search engine optimization). If you…