Typically the virtual WiFi stations that LANforge creates are encapsulated with a virtual router device which holds its own routing table for the set of devices inside it. We do this to isolate the physical host’s routing table from the virtual mobile device we are emulating. This means there are special ways to operate connections through stations. A standard ping will always send packets through the port having the default gateway (which is your management network). You want to send packets to your test network.
Networks
Management: eth0 192.168.1.101/24 gateway 192.168.1.1
Test LAN: <AP> 10.0.0.1/24 running dhcpd SSID testssid
Test WAN: eth1 176.16.0.1/24, running dhcpd
Station: wlan0 10.0.0.175/24 gateway 10.0.0.1 nameserver 10.0.0.1
Ping from the station
To ping from the station, use a version of the command sudo ip vrf exec wlan0 <command>. On a LANforge system, you would shorten that a bit and say:
sudo -s
cd /home/lanforge
./vrf_exec.bash wlan0 ping -I wlan0 www.example.com
Ideally what this does is:
- Places the vrf routing table in the ping process space
- ping -I wlan0 forces ping to send packets out wlan0
- ping calls gethostbyname() to resolve www.example.com using the nameserver given to the station via DHCP
- sends an ICMP packet to the host
If this DNS server is not reachable, or there is no route to the IP address returned in the first response, ping tries harder and starts asking other interfaces if there is a route, which can dodge the VRF. Ping is so persistent that it will also check /etc/resolv.conf. This is troublesome because that file was probably generated by the host’s dhclient process running on the management network. This can cause DNS queries to go out to your management network.
Changing resolv.conf
Changing resolv.conf to list the test net nameserver can help. It’s not great for the times when you want to run updates, but it’s not difficult to spot failed name lookups.
options timeout:1 attempts:1
nameserver 10.0.0.1
Those settings will restrict the amount of time spent by clients doing lookups.
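Before a test run, it can be useful to confirm that resolv.conf lists only the test net nameserver. The following is a minimal sketch, not part of LANforge: the function name foreign_nameservers is made up for illustration, and it assumes the test nameserver is 10.0.0.1 as in the example network above.

```shell
# Hypothetical helper: list nameservers in a resolv.conf-style file that
# are NOT the expected test-network nameserver.
#   $1: path to the file (for example /etc/resolv.conf)
#   $2: expected nameserver address (for example 10.0.0.1)
foreign_nameservers() {
    # Field 1 is the keyword and field 2 the address. Comment lines never
    # have "nameserver" as their first field, so they are skipped.
    awk -v want="$2" '$1 == "nameserver" && $2 != want { print $2 }' "$1"
}
```

Running foreign_nameservers /etc/resolv.conf 10.0.0.1 prints nothing when only the test nameserver is listed; any address it does print is a nameserver that lookups could leak to.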
Checking Route Tables
You can check the route tables available with ip route show table all, or grep for your station to see more quickly what route table is assigned to your station. Usually the route table number matches the number in the name of the station’s master interface (for example, master _vrf17 uses table 17).
root # ip li sh wlan0
22: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master _vrf17 state UP mode DORMANT group default qlen 1000
link/ether 04:f0:21:3f:00:08 brd ff:ff:ff:ff:ff:ff permaddr 04:f0:21:2a:dc:08
root # ip route show table 17
default via 10.40.0.1 dev wlan0
10.40.0.0/20 dev wlan0 scope link src 10.40.0.75
local 10.40.0.75 dev wlan0 proto kernel scope host src 10.40.0.75
broadcast 10.40.15.255 dev wlan0 proto kernel scope link src 10.40.0.75
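The lookup above can be scripted. This is a rough sketch that assumes the master device is named _vrf<N> and that the table number is N, which the listing above suggests is usual but not guaranteed. The function name vrf_table_from_link is hypothetical.

```shell
# Hypothetical helper: given one line of `ip link show` output, print the
# number embedded in the master device name (for example "_vrf17" -> "17").
# Returns non-zero if the line has no master named _vrf<digits>.
vrf_table_from_link() {
    printf '%s\n' "$1" | awk '
        {
            for (i = 1; i < NF; i++) {
                if ($i == "master" && $(i + 1) ~ /^_vrf[0-9]+$/) {
                    print substr($(i + 1), 5)   # drop the 4-character "_vrf" prefix
                    found = 1
                    exit
                }
            }
        }
        END { exit(found ? 0 : 1) }'
}
```

One way to use it: ip route show table "$(vrf_table_from_link "$(ip -o link show wlan0)")", where ip -o prints each link on a single line.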
Predicting the Route
You can match up the route between a station and a destination using ip route get:
root # ./vrf_exec.bash wlan0 ip route get 10.40.0.1 from 10.40.0.75
10.40.0.1 from 10.40.0.75 via 192.168.92.1 dev eth0 uid 0
cache
You might think that looks weird. In fact, I don’t know why eth0 is there at all.
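To catch this kind of surprise automatically, the device can be pulled out of the ip route get output and compared with the station name. This is only a sketch: route_get_dev is a hypothetical name, and it assumes the device follows the word "dev" on the first output line, as it does above.

```shell
# Hypothetical helper: print the device named after "dev" in the first
# line of `ip route get` output. Returns non-zero if no "dev" is found.
route_get_dev() {
    printf '%s\n' "$1" | awk '
        NR == 1 {
            for (i = 1; i < NF; i++) {
                if ($i == "dev") { print $(i + 1); found = 1; exit }
            }
        }
        END { exit(found ? 0 : 1) }'
}
```

If the printed device is not the station (wlan0 here), traffic to that destination is likely to leave through some other port.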
Using Perf to Trace the calls
This is a fascinating way to use perf (which uses eBPF probably) to see what actually is happening when doing a ping:
# ./vrf_exec.bash wlan0 perf trace --no-syscalls --event 'net:*' ping -R -O -c1 -n -B -I wlan0 lanforge.net
PING www.lanforge.net (10.41.0.1) from 10.40.0.75 wlan0: 56(124) bytes of data.
0.000 ping/1880863 net:net_dev_queue(skbaddr: 0xffff88818fffa400, len: 138, name: "wlan0")
0.007 ping/1880863 net:net_dev_start_xmit(name: "wlan0", skbaddr: 0xffff88818fffa400, protocol: 2048, len: 138, network_offset: 14, transport_offset_valid: 1, transport_offset: 74)
0.029 ping/1880863 net:net_dev_xmit(skbaddr: 0xffff88818fffa400, len: 138, name: "wlan0")
From 10.40.0.1 icmp_seq=1 Destination Host Unreachable
--- www.lanforge.net ping statistics ---
1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
1.133 ping/1880863 net:net_dev_queue(skbaddr: 0xffff888121376ce8, len: 174, name: "eth0")
1.137 ping/1880863 net:net_dev_start_xmit(name: "eth0", skbaddr: 0xffff888121376ce8, protocol: 2048, ip_summed: 3, len: 174, data_len: 108, network_offset: 14, transport_offset_valid: 1, transport_offset: 34, gso_segs: 1, gso_type: 1)
1.140 ping/1880863 net:net_dev_xmit(skbaddr: 0xffff888121376ce8, len: 174, name: "eth0")
Notice how, since 10.41.0.1 is not routable (unlike 10.40.0.1), ping digs down to the interface with the default route (eth0) and attempts to use that.
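When the trace is long, a summary of which interfaces saw traffic is easier to read than the raw events. The sketch below assumes each event line carries a name: "<interface>" field, as in the output above. The function name trace_devices is made up for illustration.

```shell
# Hypothetical helper: list the distinct interface names that appear in
# saved `perf trace --event 'net:*'` output, one per line, sorted.
trace_devices() {
    # Keep only lines with a name: "..." field and print what is inside
    # the quotes, then collapse duplicates.
    printf '%s\n' "$1" | sed -n 's/.*name: "\([^"]*\)".*/\1/p' | sort -u
}
```

Any interface in the result other than the station means packets left through a port you did not intend.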
Using getent
Another easy way to see if your configuration is doing what you want is to check how nsswitch is configured by using getent:
# ./vrf_exec.bash sta0000 getent ahosts www.lanforge.net
10.41.0.1 STREAM www.lanforge.net
10.41.0.1 DGRAM
10.41.0.1 RAW
This is much easier to use. If we had been paying attention to what was in our /etc/hosts file, we would have spotted this typo sooner!
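Since the answer here came from /etc/hosts, a quick look at what that file says for a name can save time. The following is a minimal sketch with a hypothetical function name, hosts_entries_for. It assumes the usual hosts format of an address followed by one or more names, with # starting a comment.

```shell
# Hypothetical helper: print every address that a hosts-format file maps
# to the given name.
#   $1: path to the file (for example /etc/hosts)
#   $2: host name to look for
hosts_entries_for() {
    awk -v name="$2" '
        $1 !~ /^#/ {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break          # rest of the line is a comment
                if ($i == name) print $1
            }
        }' "$1"
}
```

For example, hosts_entries_for /etc/hosts www.lanforge.net shows whether a static entry is overriding what the station’s nameserver would return.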