Latest Posts

Best of Breed

I recently heard the term “best of breed” used when discussing network vendor selection. I was surprised by this answer because you don’t hear it too often. The more I thought about it, the more I wondered: why not “best of breed” selection? My time as a network and infrastructure supervisor has taught me that a data center environment can be full of different compute and storage vendor products. Our SAN environment consists of Pure, Tegile, EMC, and even QNAP. Each product has its place. Pure serves the VDI environment, Tegile and EMC host production, and QNAP serves as a target for our Veeam backups. The team has also categorized and carved out each platform into tiered offerings.

On the other hand, network vendor selection tends to be biased.  Typically you’ll see one network vendor selected for the edge/access, distribution, and core. However, you will find a different wireless vendor from time to time.

I’ve seen many reasons for this, so I’ve compiled a list of the most popular reasons I’ve heard:

  • We would like to interact with only one vendor for purchases and support.
  • ABC vendor only works well with a particular management tool.
  • I only know vendor ABC, and we don’t have time to learn something new.
  • Did you hear that vendor ABC had an issue with XYZ product? I don’t want those problems.
  • Everyone else uses vendor ABC.
  • No other vendor supports my VOIP feature set.
  • You can’t do XYZ well or at all with any other vendor product.

I will say that a few of the use cases listed above can keep you tied to a particular vendor. However, what’s the harm in looking into something new or trying something different? We shouldn’t be afraid of learning new things. I’ve done comparisons and demos, and I’ve had to choose a different WAN router before due to a lack of offerings. Management tools can be a difficult topic of discussion, but networking gear tends to stick with long-time veteran SNMP for management, which opens up possibilities. Even Cumulus Networks, with all its automation focus, still provides legacy SNMP support. However, you wouldn’t want to use SNMP for automation with Cumulus, as that would defeat the main reason for choosing their product.

So why not design around different use cases? If you need leaf/spine, don’t build the traditional access/distribution/core.  If you require a robust NAC solution, maybe Extreme Networks at the edge would be worth looking at instead of Cisco. If you require low latency, perhaps look into Arista or Mellanox.  If you’re in the service provider arena, Ciena may be worth looking into.

The networking field is a fun place to be right now. The variety of vendors creates some great competition and interesting niche feature sets. Strongly consider sticking with open standards if you travel down the “best of breed” road. Just remember to embrace openness and have fun. And yes, Cisco and Juniper have their places as well.

A simple start to project management

Being able to track your work efficiently is a very useful skill. For years I completed my work but rarely tracked it in a project management professional (PMP) sort of way. Sure, I’ve done the weekly reports, sticky notes, Outlook tasks, and Outlook calendar block scheduling, which are all useful. However, simple project management skills help create a consistent and straightforward approach to managing time, resources, and tasks. I’ve seen organizations take everything from an all-guns-blazing PM approach to no approach at all. Sometimes you’ll see IT subject matter experts resist PM with statements like “I’m too busy” or “It takes too much time,” but in reality, basic project management is not that difficult.

Here’s an example of how you can model behavior and start to implement some basic PM skills. I recently had a team reach out to discuss a new project that would require infrastructure resources. The team pulled up a draft diagram, and we began our dialog. I started to ask probing questions, and the diagram began to transform. Once I was comfortable that I understood what we were trying to accomplish, I shared my screen with the team and opened up OneNote. I began typing each major task that needed to be completed, and here’s what I came up with:

Work breakdown schedule (WBS) | Start date | Completion date | Assignment | Additional resources | ~hours
Site A Start | | | | |
Review existing firewall DMZ networks for possible reuse | | | | |
New External DMZ VLAN creation | | | | |
Setup new VLAN HP infra chassis/VC setup | | | | |
Setup new Vsphere virtual switch | | | | |
New Ext. DMZ subnet/routing (plan for 5 hosts future, + 3 F5 LB) | | | | |
Request 2 VMs | | | | |
New Ext. DMZ IP addresses | | | | |
New Ext. DMZ setup on F5 LB | | | | |
New Ext. DMZ firewall rules (internal) | | | | |
F5 – VS creation for (internal) | | | | |
New Ext. DMZ firewall rules (external) | | | | |
F5 – VS creation for (external) | | | | |
External DNS creation | | | | |

Now all I have to do is work out the timeline and gather my team resources. Then I’ll collaborate with each of my team members to determine their availability for the project and make sure that I can fit everything into the expected timeline. Of course, there’s a lot more to a project, but this example can provide an excellent introduction to creating work breakdown schedules. This project also has some HA requirements, but we started the testing phase without HA. This table could be duplicated for Site B, C, etc. for the HA build. You can use Excel or tables within OneNote to get started. I would suggest looking at Gantter via www.smartapp.com, which is free and easy to use if you like Gantt charts. If you’re looking for more sophisticated tools, take a look at MS Project, TeamDynamix, or Smartsheet.
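If you would rather start in a spreadsheet than in OneNote, the table above can be sketched as a CSV file. This is only a minimal illustration in Python: the function names are made up for this example, only a few of the Site A tasks are listed, and the empty cells are left blank until the timeline and resources are known.

```python
import csv
import io

# Column headings copied from the table above.
COLUMNS = ["Work breakdown schedule (WBS)", "Start date", "Completion date",
           "Assignment", "Additional resources", "~hours"]

# A few of the Site A tasks from the table; the other cells stay empty
# until the timeline and team resources are worked out.
SITE_A_TASKS = [
    "Review existing firewall DMZ networks for possible reuse",
    "New External DMZ VLAN creation",
    "Request 2 VMs",
    "External DNS creation",
]


def build_wbs(site, tasks):
    """Return the WBS as CSV text: a site heading row plus one row per task."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    empty_cells = [""] * (len(COLUMNS) - 1)
    writer.writerow(COLUMNS)
    writer.writerow([f"{site} Start"] + empty_cells)
    for task in tasks:
        writer.writerow([task] + empty_cells)
    return buffer.getvalue()


def total_hours(csv_text):
    """Sum the ~hours column, skipping rows that have no estimate yet."""
    rows = csv.DictReader(io.StringIO(csv_text))
    return sum(float(row["~hours"]) for row in rows if row["~hours"])


if __name__ == "__main__":
    print(build_wbs("Site A", SITE_A_TASKS))
    # The same task list can be reused for the HA build at another site.
    print(build_wbs("Site B", SITE_A_TASKS))
```

Saving the output with a .csv extension lets Excel open it directly, and the hours column can be totaled once estimates are filled in.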

Checkpoint CPU optimizations

We recently upgraded some of our WAN link bandwidth capacity from 1Gbps to 10Gbps to decrease backup transfer times between our two data centers. Traffic between the sites is encrypted by physical Checkpoint open servers. The upgrade to the WAN links involved installing 10Gbps Intel NICs in our Checkpoint open servers. Once all the pieces were in place, I started to test everything using iperf3.

My initial iperf3 TCP results showed a maximum throughput of around 650Mbps. Something seemed to be limiting my ability to push more traffic between the data centers. I started by looking at the Checkpoint VPN open server gateway at the primary site. Using top in expert mode, I found that none of the CPUs were maxed out. However, when I ran tests, one core climbed to 85%. I noted it and moved to the next device in the link. At the other site, the open server was maxing out one CPU core at 100%. The two open servers run on different hardware with different CPU clock speeds, so it made sense that the server with the lower clock speed was maxed out. What I didn’t understand was why the other CPU cores weren’t being utilized.

The Checkpoint expert mode command “fw ctl affinity -l” showed that each NIC was assigned a single core and that multiple firewall kernel instances were mapped to individual cores. A top session showed that ksoftirqd was maxing out the single CPU core assigned to the 10G NIC interface. After reading through Checkpoint’s performance tuning administration guide, I discovered that I would have to enable multi-queue to assign multiple CPU cores to a NIC. The “cpmq get -a” command in expert mode verified that my Intel 10G NIC supported multi-queue and could be assigned up to 16 rx queues. I had eight cores available on each open server. I unassigned a firewall kernel instance from a CPU core to free it up for the 10G NIC. I then enabled multi-queue on the interface and assigned two NIC queues to two CPU cores. The reconfiguration required two reboots of the gateway: the first to remove the firewall kernel instance from one core, and the second to enable multi-queue.

Once everything was set up, I was able to push 1Gbps using iperf3 across the VPN tunnel. Top showed that the NIC queues were utilizing two cores. Now I have to upgrade our ESXi host LAG, as I’m running into the limit of a single 1Gbps link within the LAG. If I run into any Checkpoint CPU utilization issues, I can always create more NIC queues.

 

Running the cpmq get -v command now shows the following:

Active ixgbe interfaces:
eth4 [On]

The rx_num for ixgbe is: 2 (default)

multi-queue affinity for ixgbe interfaces:
eth4:
        irq     |       cpu     |       queue
     -----------------------------------------------------
        210             0               TxRx-0
        218             2               TxRx-1

 

Running fw ctl affinity -l now shows the following:

eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 7
fw_2: CPU 3
fw_3: CPU 1
fw_4: CPU 4
fw_5: CPU 6
Interface eth4: has multi queue enabled
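If you want to keep an eye on the core layout over time, the affinity listing can be parsed with a few lines of Python. This is a rough sketch that assumes the output looks exactly like the listing above (other versions may format it differently). The function names are made up for this example, and the core count of eight is what I had on each open server.

```python
import re

# Output of "fw ctl affinity -l" as shown above.
SAMPLE = """\
eth3: CPU 0
fw_0: CPU 5
fw_1: CPU 7
fw_2: CPU 3
fw_3: CPU 1
fw_4: CPU 4
fw_5: CPU 6
Interface eth4: has multi queue enabled
"""


def parse_affinity(text):
    """Return ({name: core}, [multi-queue interfaces]) from the listing."""
    pinned = {}
    multi_queue = []
    for line in text.splitlines():
        line = line.strip()
        match = re.fullmatch(r"(\S+): CPU (\d+)", line)
        if match:
            pinned[match.group(1)] = int(match.group(2))
            continue
        match = re.fullmatch(r"Interface (\S+): has multi queue enabled", line)
        if match:
            multi_queue.append(match.group(1))
    return pinned, multi_queue


def unlisted_cores(pinned, total_cores=8):
    """Cores with no NIC or firewall instance pinned to them in the listing."""
    return sorted(set(range(total_cores)) - set(pinned.values()))


if __name__ == "__main__":
    pinned, multi_queue = parse_affinity(SAMPLE)
    print("Pinned:", pinned)
    print("Multi-queue interfaces:", multi_queue)
    print("Cores not listed:", unlisted_cores(pinned))
```

With the listing above, the only core that does not show up is core 2, which matches the second eth4 queue (TxRx-1) in the cpmq output.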

Extreme Networks – enabling a few things

Some of my most visited posts seem to be on Brocade switching config, so I decided to put together our standard list of commands for some Extreme Networks switches we use. These commands can be used on the B5, C5, K Series, 7100 Series, and S Series Extreme Networks switches. Some commands are self-explanatory, but for others I added a short description.

This command sets the vlan for the management ip set on the switch.

->set host vlan “vlanid”

->set ip address “ip address” mask “subnet mask” gateway “gateway ip”

We disable cdp on all edge ports without VOIP phones.

->set cdp state disable “port string”

We use the ciscodp command in order to set the tagged voice vlan on Cisco phones.

->set ciscodp port vvid “voice vlan” “port string”

We manually configure a small set of vlans for each building, so gvrp isn’t necessary.

->set gvrp disable

->set igmpsnooping adminmode enable

->set igmpsnooping interfacemode “port string” enable

->set maclock enable

->set maclock enable “port string”

We limit the number of MACs that can be learned on the port and make it equal to the number of MAC authentications we can do per port. MAC auth sessions are limited by switch model type.

->set maclock firstarrival “port string” “number”

->set macauthentication reauthentication enable “port string”

We set the maximum number of mac authentication sessions per port. This is limited based on switch model type.

->set multiauth port numusers 8 “port string”

You can enable more than one type of port authentication. By default we use MAC auth, but you can also set up 802.1x auth. If you fail 802.1x auth, MAC auth will be the next method of authentication.

->set multiauth precedence mac dot1x

->set port broadcast “port string” “pps threshold value”

We clear all the default snmp settings

->clear snmp access ro security-model v1

->clear snmp access ro security-model v2c

->clear snmp access public security-model v1

->clear snmp access public security-model v2c

->clear snmp community

->clear snmp group ro ro security-model v1

->clear snmp group ro ro security-model v2c

->clear snmp group public security-model v1

We setup every device with snmp v3 authentication.

->set snmp access public user “snmpusername” security-model usm

->set snmp user “snmpusername” authentication md5 “auth pass” encryption des privacy “priv pass”

->set snmp viewname All subtree 1

->set spantree spanguard enable

->set spantree adminedge “port string” true

->set ssh enable

->set telnet disable inbound

->set telnet disable outbound

->set webview disable

->set port alias “port string” “alias”

These commands enable (auto) or disable (off) PoE on an edge port.

->set port inlinepower “port string” admin auto

->set port inlinepower “port string” admin off

Untagged vlan port setup.

->set port vlan “port string” “vlan id”

Tagged vlan port setup.

->set vlan egress “vlan id” “port string”

The one thing that could be better is the implementation of a command to explicitly apply the running config to the startup config. All of these commands are applied and saved automatically as soon as they are entered.
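Since most of these commands only differ by port string, the edge port portion is easy to generate. Here is a small Python sketch that fills the placeholders into the command patterns listed above. The function name, the port string ge.1.1, and the alias are made-up examples, and you should check the exact syntax against the CLI reference for your switch model before pasting anything in.

```python
def edge_port_commands(port_string, vlan_id, alias, max_macs=8, voice_vlan=None):
    """Build the standard edge port commands from the list above.

    max_macs is used for both the maclock limit and the multiauth user
    count, since we keep those two numbers equal.
    """
    commands = [
        f"set port alias {port_string} {alias}",
        f"set port vlan {port_string} {vlan_id}",
        f"set spantree adminedge {port_string} true",
        f"set maclock enable {port_string}",
        f"set maclock firstarrival {port_string} {max_macs}",
        f"set multiauth port numusers {max_macs} {port_string}",
    ]
    if voice_vlan is None:
        # No VoIP phone on this port, so CDP is disabled.
        commands.append(f"set cdp state disable {port_string}")
    else:
        # Cisco phone on this port: set the tagged voice VLAN.
        commands.append(f"set ciscodp port vvid {voice_vlan} {port_string}")
    return commands


if __name__ == "__main__":
    for command in edge_port_commands("ge.1.1", 10, "lab-printer"):
        print(command)
```

The output can be pasted into an SSH session or fed to whatever tool you use to push config.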

Troubleshooting ARP/IGMP/Router CPU

We recently started having issues with a building reporting that ICMP stopped responding on a distribution router and some access switches behind the router. Some routing interfaces would respond, but the management VLAN interface wouldn’t. Further troubleshooting showed that CPU utilization on the router, which is comprised of two Extreme Networks 7100 series switches running OSPF, climbed to 80-100%. The “show logging buffer” command revealed massive amounts of host-dos ARP attack events. Our first thought was that an infected machine was creating an ARP storm. This would happen about twice a day for about a minute, but not at the exact same time. We tracked down the MAC addresses and removed the PCs from the network. This didn’t seem to help, as another set of MAC addresses would show up in the logging buffer for host-dos ARP attack events the next day. We decided to start running a Wireshark packet capture. We could see the ARP storm along with some other IGMP traffic that would easily overwhelm a Wireshark session, but we couldn’t identify the root cause.

After further investigation of the host-dos ARP logs, we noticed that the source interface should have been in STP blocking mode because it was a redundant link to the access switch. My thought was that the massive ARP flooding could have been caused by a loop. Why would a loop occur? I then caught the CPU process table during the outage and found that the IGMP process was consuming the router CPU. Could it be that the ARP storm wasn’t the root cause, but a secondary symptom of something else going on? We decided to disable the redundant interfaces, which would take the possibility of a loop out of the picture. My theory was that the high CPU was causing dropped BPDUs, so the secondary link would go into forwarding on the access switch while the CPU-bound router was still using the original link, which caused the ARP storm/loop.

The issue continued with the redundant links disconnected, but now we weren’t seeing the host-dos ARP logs. OK, so we knew we had high CPU utilization. We also knew it was the IGMP process. There was a slight traffic correlation on the routing interface before the CPU spike, so I enabled NetFlow on the upstream core router. NetFlow started forwarding data to PRTG (a network monitoring utility). PRTG showed that the top talker was a newly built Landesk server. Now we were getting somewhere. Further research into Landesk revealed that the product uses multicast. My team decided to run another packet capture while booting up a lab and presto, the CPU started to spike on the router. The packet capture revealed a large amount of multicast traffic classified as Landesk traffic. The multicast address was 239.83.100.109 with a UDP destination port of 33355, which was defined as Landesk “software distribution”. The flood of multicast traffic seemed to be the culprit behind the high router CPU utilization.
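When chasing multicast like this, it helps to know the Ethernet MAC address a group maps to and to have a capture filter ready. The Python sketch below uses the standard IPv4 multicast mapping (01:00:5e plus the low 23 bits of the group address) and builds a pcap-style capture filter for the group and port from the capture above. The function names are made up for this example.

```python
import ipaddress


def multicast_mac(group):
    """Map an IPv4 multicast group to its Ethernet multicast MAC address."""
    address = ipaddress.IPv4Address(group)
    if not address.is_multicast:
        raise ValueError(f"{group} is not an IPv4 multicast address")
    low23 = int(address) & 0x7FFFFF  # only the low 23 bits are carried over
    octets = [0x01, 0x00, 0x5E,
              (low23 >> 16) & 0xFF, (low23 >> 8) & 0xFF, low23 & 0xFF]
    return ":".join(f"{octet:02x}" for octet in octets)


def is_admin_scoped(group):
    """True if the group is in the administratively scoped 239.0.0.0/8 range."""
    return ipaddress.IPv4Address(group) in ipaddress.ip_network("239.0.0.0/8")


def capture_filter(group, udp_port):
    """Build a pcap capture filter for one multicast group and UDP port."""
    return f"dst host {group} and udp dst port {udp_port}"


if __name__ == "__main__":
    group, port = "239.83.100.109", 33355
    print(multicast_mac(group))
    print(is_admin_scoped(group))
    print(capture_filter(group, port))
```

The filter string can be used as a Wireshark or tcpdump capture filter so the session only collects the suspect traffic instead of everything on the link.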

Happy troubleshooting,

@javi_isolis

Some PHP snmp scripting

I was digging through some of my old notes and came across a few networking PHP scripts that I put together for some Proxim AP-4000 access points. I put this script and many others together to help manage these standalone access points before there were wireless controllers. This particular PHP script uses a while loop to set some SNMP values that change the AP filters. The SNMP values within this script can be swapped out to change other settings as well. Your setup will require PHP installed along with the snmp package. Have fun.

<html>
<head>
<title>
AP-4000 Filter modification Script
</title>
</head>

<body>

<?php
// set the variable that will be the start number of the last octet within your IP range
$ip = 100;
// set your snmp RW password (community string)
$snmpRwPass = "yoursnmppassword";

// loop that will snmpset each AP mgmt IP address, starting at your IP variable and ending at your max value
while ($ip <= 111) {

    // modifies snmp values of Proxim AP-4000 filters
    snmpset("192.168.1.$ip", $snmpRwPass, ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.1", "i", "1", "10");

    snmpset("192.168.1.$ip", $snmpRwPass, ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.2", "i", "1", "10");

    snmpset("192.168.1.$ip", $snmpRwPass, ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.3", "i", "1", "10");

    snmpset("192.168.1.$ip", $snmpRwPass, ".1.3.6.1.4.1.11898.2.1.5.5.1.0", "i", "1", "10");

    // print output of each AP mgmt IP that's completed
    echo "done with 192.168.1." . $ip;
    echo "<br>";

    // counter for increasing AP mgmt IP
    $ip++;

}
?>
</body>
</html>
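If you don’t have a web server with PHP handy, the same loop can be sketched in Python by building command lines for the net-snmp snmpset tool. This is only an illustration under a few assumptions: net-snmp is installed, SNMPv1 is what the APs expect, and the function names are made up for this example. It defaults to a dry run that prints the commands instead of running them.

```python
import subprocess

# OIDs copied from the PHP script above; each one is set to integer 1.
FILTER_OIDS = [
    ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.1",
    ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.2",
    ".1.3.6.1.4.1.11898.2.1.5.5.3.1.6.3",
    ".1.3.6.1.4.1.11898.2.1.5.5.1.0",
]


def build_snmpset_commands(community, first=100, last=111, prefix="192.168.1."):
    """Return one snmpset command (as an argument list) per AP and OID."""
    commands = []
    for octet in range(first, last + 1):
        host = f"{prefix}{octet}"
        for oid in FILTER_OIDS:
            commands.append(
                ["snmpset", "-v1", "-c", community, host, oid, "i", "1"]
            )
    return commands


def apply_commands(commands, dry_run=True):
    """Print the commands, or run them when dry_run is False."""
    for command in commands:
        if dry_run:
            print(" ".join(command))
        else:
            subprocess.run(command, check=True)


if __name__ == "__main__":
    apply_commands(build_snmpset_commands("yoursnmppassword"))
```

Reviewing the dry run output first is a cheap way to confirm the IP range and OIDs before touching any access points.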

 

Portable home lab virtualization server + gaming

I have a few PCs that I use for testing, gaming, and other side projects. I wanted to pare down to fewer systems, so I started looking into a portable VM home lab server that could run at least four different test VMs and also allow for some decent gaming performance using hardware GPU passthrough to a VM.

I first pondered on the Intel NUC Skull Canyon. It’s pretty portable, tough looking, and powerful, but it lacked the ability to easily install an external GPU and the ability to easily install a hypervisor when it first launched. It’s also pretty expensive and I was trying to stay around the $500-$600 range. I started looking at a few mini ITX cases and remembered coming across the ASRock M8 Mini ITX design in the past.

[Images: ASRock M8 portable server; ASRock M8 side view]

I ended up finding the case, but it was a barebones-only bundle that already included an older motherboard that didn’t support the type of passthrough that I was looking to utilize. The case reminded me of the old G4/G5 Apple cases with the handles on each corner. The handles make it a lot easier to carry. I ended up finding a discounted open box unit from Newegg that made the purchase a little more bearable.

Here’s the home lab setup specifications and costs:

Case: ASRock M8 Mini ITX (included LGA 1150 mobo w/pwr supply) open box $186.69
Memory: G.SKILL Aegis 16GB (2 x 8GB) 288-Pin DDR4 SDRAM DDR4 2133 $52.42
CPU: Intel Xeon E3-1225 v5 SkyLake 3.3 GHz 8MB L3 Cache LGA 1151 $234.99
Motherboard: ASRock Server/Workstation MB-C236WSI $213.00

Total: $687.10

I ditched the 1150 motherboard and installed the Intel C236 chipset based LGA 1151 motherboard along with a Skylake based Xeon processor. My first attempt at installing VMware ESXi 6.0 failed because I couldn’t get anything working with passthrough. I tried multiple versions of ESXi, but I still couldn’t get my ATI 6770, USB, or sound passthrough working. I tried a few other graphics cards, but without sound, I threw the ESXi hypervisor out of the picture. I then decided to try an installation of XenServer 7 and, to my surprise, I was able to pass through all of my components. I did have to manually run some commands to get things going, but in the end I ended up with a VM that could possibly do some decent gaming.

[Images: ASRock MB-C236WSI; ASRock M8 with MB-C236WSI Skylake Xeon motherboard; ASRock M8 rear with MB-C236WSI Skylake Xeon motherboard]

In order to get all the passthrough devices working, you may have to do some work within the CLI. I didn’t have to worry about setting up the GPU within the CLI as it was already listed as a passthrough device in the Xenserver GUI management interface.

In order to add other devices, first find your VM UUID once provisioned through the GUI manager. Then run the following command within the Xenserver CLI interface:

lspci -k | more

Find your PCI devices. I was specifically looking for USB and sound. If you want to add multiple passthrough devices, run the next command once, with all of your PCI devices listed in the same command along with your VM UUID:

xe vm-param-set other-config:pci=0/000:00:1f.3,0/000:00:14.0 uuid=a4f084ae-e8cf-144a-ac31-7bf456e333b5
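Because all of the devices have to go into a single command, a tiny helper can save some typing mistakes. This Python sketch just assembles the command shown above from a list of devices. The function name is made up for this example, and the device strings and UUID are copied as they appear in my command, so substitute your own from lspci and the GUI manager.

```python
def pci_passthrough_command(vm_uuid, pci_devices):
    """Build one xe command that lists every passthrough device at once."""
    if not pci_devices:
        raise ValueError("at least one PCI device is required")
    # Each device is prefixed with "0/" and the entries are comma separated,
    # matching the format of the command shown above.
    pci_value = ",".join(f"0/{device}" for device in pci_devices)
    return ["xe", "vm-param-set", f"other-config:pci={pci_value}",
            f"uuid={vm_uuid}"]


if __name__ == "__main__":
    command = pci_passthrough_command(
        "a4f084ae-e8cf-144a-ac31-7bf456e333b5",
        ["000:00:1f.3", "000:00:14.0"],  # sound and USB from my lspci output
    )
    print(" ".join(command))
```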

Why NetDevOps/NetOps will become important for Network Admins

Being a network administrator/engineer typically requires typing in ssh consoles to get things going. At some point, being able to automate tasks or being able to manipulate configurations based on a certain outcome will become necessary. I’ve gathered a few thoughts on real world views to network automation. The buzzword floating around for this topic is NetDevOps.

NetOps/NetDevOps(my definition): Network automation using code to run commands that would normally have to be typed in manually into each device. Example: Run code that can parse or write through configs, logs, and snmp values in order to take action on a specific outcome.

I won’t go into the details of the ins and outs of NetOps/NetDevOps and how to get started with coding. I’ve provided a list of links with information that other really smart people came up with.

Detailed Definitions:
https://cumulusnetworks.com/blog/netdevops-networking-methods-with-a-devops-mindset/
http://packetpushers.net/pull-my-strings-im-your-puppet-juniper-bringing-devops-to-networking/
Some examples:
https://www.nanog.org/sites/default/files/Carr_What_S_Netdevops_Why.pdf
Getting Started with coding:
https://www.nanog.org/sites/default/files/Swafford_Netops_Coding_101.pdf

Ok, now what can NetDevOps actually do for you network administrators out there? I started to create a list of items that NetDevOps could put a dent in. I don’t feel that you require a Google or Facebook sized infrastructure to take advantage of NetDevOps. My team and I currently manage around 120 switching/routing devices and we’re headed to add lots more. That’s no Google, so here’s my list:

  • Changing network admin passwords for devices when staffing changes can be very time consuming. Running some code that can SSH into switches and routers to update passwords and privileges  could be a very useful feature to have.
  • If you have routers that maintain the same ACLs or route maps, having to make changes can be a daunting task even if it’s only a dozen routers. Using code to automatically upload new changes to duplicate ACL’s and route maps across your routers will reduce time and human input errors.
  • I also have a few ideas about gathering wireless user data and plotting details on a google map to indicate AP’s that have a large volume of user connectivity. The map would provide visual information in real time that can help determine if you’re having a sticky client situation. You could also make some automated config changes to your AP’s power levels or with minimum basic rates to help the situation out.
  • You could monitor the bandwidth available across network links and trigger an automatic response that applies route maps to redirect traffic or applies QoS rules to your programming heart’s content.

I’m sure there are lots of other examples out there, so please post any others that you may have. I know this last one is heading down the OpenFlow rabbit hole, but hey, if you could do these types of things with your current equipment using a NetDevOps approach, why not?
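To make the ACL idea from the list above a little more concrete, here is a minimal Python sketch that checks which routers have drifted from the ACL you want everywhere. It assumes you have already pulled each router’s current ACL as a list of lines (for example over SSH). The router names and ACL lines are made-up placeholders, not real vendor syntax.

```python
import difflib


def acl_drift(desired, current_by_router):
    """Return {router: diff lines} for routers whose ACL differs from desired."""
    drift = {}
    for router, current in current_by_router.items():
        if current != desired:
            drift[router] = list(difflib.unified_diff(
                current, desired,
                fromfile=f"{router} (current)", tofile="desired",
                lineterm="",
            ))
    return drift


if __name__ == "__main__":
    # Placeholder ACL lines for illustration only.
    desired_acl = ["permit 10.0.0.0/8", "permit 192.168.0.0/16", "deny any"]
    routers = {
        "router-1": ["permit 10.0.0.0/8", "permit 192.168.0.0/16", "deny any"],
        "router-2": ["permit 10.0.0.0/8", "deny any"],
    }
    for router, diff in acl_drift(desired_acl, routers).items():
        print(router)
        print("\n".join(diff))
```

Only the routers that show up in the result need a change pushed, which keeps the blast radius small and gives you a diff to review before anything is uploaded.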

What does a Network Administrator do?

I wanted to share what a network administrator’s job duties, functions, and tasks may entail on a daily basis. For those new to the realm of IT, a network administrator typically interacts with the hardware/software components that transfer data to and from devices over a physical distance through some type of medium. Some of these devices include: personal computers, laptops, tablets, servers, switches, routers, firewalls, load balancers, wireless access points, and any other devices that rely on transmitting data. The components and tasks typically handled daily by a network administrator include switches, routers, wireless access points, DHCP/DNS servers, IP address provisioning, documenting/diagramming the network, monitoring bandwidth usage, and maintaining copper/fiber cable plants.

Network Admin replacing Enterasys E1 switches.

The scope of what network administrators manage on a day to day basis typically depends on the IT organizational business structure. Organizations that are small or depend on a small IT work force may expect the network administrator to handle more than some of the items listed above. You may be the person who also manages/deploys servers, email systems, desktops, and a host of other things. This typically gives you a plethora of never ending projects that will keep you busy, which will hopefully make you a master of all disciplines. I don’t necessarily see this as a bad thing, but you may not have enough time in the day to architect the network the way you want it or in the best way it could be.

Completed fresh installation of Extreme Networks C5k stack.

Some organizations may lump other responsibilities onto the Network Administrator. Telecom and networking is a combination I have seen before. Expect to work with telephony devices and services like VoIP, mass messaging, and voicemail services. Another combination is security and networking which would entail working on the network along with security devices/appliances and coming up with strategies on protecting data. Some networking devices have security features and functions embedded into them such as firewall/routing devices that perform IPS (Intrusion prevention system), IDS (intrusion detection system), spam filtering, and URL filtering/blocking which make networking and security devices easier to manage. I’m not promoting that organizations follow these models, but do expect to see some businesses operate in this fashion.

Larger IT organizations sometimes have the network administrator work specifically with networking devices/appliances. In this environment, you will have the opportunity to thrive by sharpening your skill set and becoming a master of networking. Almost everything touches a network in today’s world, so always expect to troubleshoot issues that help other areas solve their problems. Your mission, should you choose to accept it, is to keep the network up and running efficiently. Just don’t forget to have fun and follow your passion. If you also have thoughts about obtaining certificates, check out this article by network guru Shane Killen, or check out my article on certificates. Please feel free to add other responsibilities that you may have heard that a network administrator does on a daily basis in the comments below.

Checkpoint VPN MEP by default…

I started having issues that required deploying another Checkpoint VPN gateway. My team set up the new VM, installed Checkpoint Gaia, and completed the configuration for VPN. I created a new site in my Windows Checkpoint endpoint security client that pointed to the new DNS entry, and off I went. After a few days I started to have issues connecting to the new VPN gateway, so I enabled logging in the Checkpoint endpoint client. I discovered that my client was trying to connect to one of my original VPN gateways even though I didn’t have the original gateway defined in the VPN client. After a quick call to support, we found out that MEP (multiple entry point) is enabled by default on Checkpoint VPN gateways that use the same encryption domain. I had to disable MEP, but couldn’t find any settings in the GUI. The following KB article gives directions on how to disable MEP:

https://supportcenter.checkpoint.com/supportcenter/portal?eventSubmit_doGoviewsolutiondetails=&solutionid=sk78180

MEP wasn’t the desired configuration, but I could see its benefit of being enabled for a redundant VPN gateway setup. I may enable MEP in the future. Only time will tell.
