Probing questions to ask before troubleshooting
Scope the problem
When did it start? (timeline)
Pin the exact time. Did it coincide with a change, reboot, software update, or someone else's work? Time narrows root cause dramatically.
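One way to act on the timeline is to line the reported onset up against a change log. The sketch below assumes a hypothetical list of (timestamp, description) records and an arbitrary two-hour look-back window; neither comes from any particular change-management tool.

```python
from datetime import datetime, timedelta

def changes_before_onset(onset, changes, window=timedelta(hours=2)):
    """Return changes made within `window` before the onset, newest first."""
    hits = [(ts, desc) for ts, desc in changes if onset - window <= ts <= onset]
    return sorted(hits, reverse=True)

# Hypothetical change log entries, for illustration only.
change_log = [
    (datetime(2024, 3, 4, 8, 15), "firmware update on access switch"),
    (datetime(2024, 3, 4, 13, 40), "firewall rule change"),
    (datetime(2024, 3, 3, 22, 0), "server patching"),
]

# Users report the fault started at 14:05 on 4 March.
suspects = changes_before_onset(datetime(2024, 3, 4, 14, 5), change_log)
for ts, desc in suspects:
    print(ts.isoformat(), desc)
```

A wider window catches slow-burning changes at the cost of more candidates to rule out.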
How many users are affected?
One user = endpoint or account issue. A floor = switch or AP. Everyone = upstream, core infrastructure, or ISP. Scope drives where you look first.
Is it constant or intermittent? (pattern)
Constant faults point to config or failed hardware. Intermittent issues suggest load spikes, flapping links, interference, or marginal cables.
What changed recently?
New hardware, firmware updates, patching, config changes, or physical moves. The most common cause of network faults is a recent change.
Which applications are affected? (app layer)
All apps = network-level fault. One app = DNS, firewall rule, or service issue. Helps isolate whether the problem is at L3/4 or L7.
Endpoint & device
What OS and device type?
Windows, macOS, Linux, iOS, and Android all have different network stacks and driver quirks. Knowing the platform helps target OS-specific fixes and logs.
Wired or wireless? (medium)
Wired issues = cable, NIC, switch port. Wireless issues = signal, interference, AP capacity, or SSID config. Fundamentally different fault trees.
Does another device on the same port/SSID work?
Isolates whether the fault is in the device (driver, NIC, config) or in the network infrastructure it connects to.
What IP address does the device have? (addressing)
APIPA (169.254.x.x) = DHCP failure. Wrong subnet = DHCP scope or VLAN misconfiguration. Correct IP moves investigation to routing and DNS.
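The addressing check above can be sketched with Python's standard `ipaddress` module. The function name, the return strings, and the example subnet are illustrative assumptions; substitute the subnet the device is supposed to be on.

```python
import ipaddress

def classify_address(ip, expected_subnet):
    """Map a device's IPv4 address to the next troubleshooting step."""
    addr = ipaddress.ip_address(ip)
    if addr.is_link_local:  # 169.254.0.0/16, i.e. APIPA
        return "APIPA: DHCP likely failed"
    if addr not in ipaddress.ip_network(expected_subnet):
        return "wrong subnet: check DHCP scope or VLAN"
    return "address OK: move on to routing and DNS"

# Example subnet is an assumption for illustration.
for ip in ("169.254.12.7", "10.20.99.5", "10.20.30.41"):
    print(ip, "->", classify_address(ip, "10.20.30.0/24"))
```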
Has the device been rebooted?
Rules out transient driver or network-stack state issues. If a reboot fixes it, the problem was likely a memory problem, a hung process, or a stale ARP or routing cache.
Infrastructure & topology
Where is the affected user located? (physical)
Floor, building, or remote site? Narrows which switch, AP, uplink, or WAN circuit to examine and whether the issue is site-specific.
What VLAN is the device on?
A VLAN mismatch between the port and what the device expects silently drops traffic. Verify in the switch config that the access port is assigned to the correct VLAN (access ports carry their VLAN untagged; tagging applies to trunk ports).
Any errors on switch/router interfaces? (errors)
CRC errors, input/output drops, and flaps indicate bad cabling, duplex mismatch, or overloaded ports — surface-level checks before deep dives.
Is the gateway reachable? (path)
Ping the default gateway first. Reachable = LAN is up, look further out. Unreachable = local segment, cable, switch, or routing table problem.
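A gateway check can be scripted by shelling out to the system `ping`. This is a minimal sketch: it assumes `ping` is on the PATH and that the only platform difference that matters is the count flag (`-n` on Windows, `-c` elsewhere). The helper names are made up for illustration.

```python
import platform
import subprocess

def ping_command(host, count=3, system=None):
    """Build the ping argument list for the given (or current) platform."""
    system = system or platform.system()
    count_flag = "-n" if system == "Windows" else "-c"
    return ["ping", count_flag, str(count), host]

def is_reachable(host, count=3):
    """Return True if ping exits with status 0 (at least one reply on Linux/macOS)."""
    result = subprocess.run(
        ping_command(host, count),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Calling `is_reachable("192.168.1.1")` (a placeholder for your own default gateway) returns a simple yes/no. Treat the result as a hint rather than proof: some gateways drop ICMP, and Windows `ping` can exit with status 0 even when the replies are "destination host unreachable" messages.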
Can the device resolve DNS?
Many "no internet" complaints are DNS failures with working IP connectivity. Test with nslookup or dig before assuming a routing problem.
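A quick name-resolution probe can also be written with the standard `socket` module. Note the difference from `nslookup` or `dig`: `socket.getaddrinfo` goes through the operating system's resolver (hosts file and cache included) rather than querying a DNS server directly, so this sketch answers "can this machine resolve the name?" rather than "is the DNS server answering?". The `resolver` parameter is an assumption added so the function can be exercised without a network.

```python
import socket

def diagnose_name_resolution(hostname, resolver=socket.getaddrinfo):
    """Report whether the OS resolver can turn `hostname` into addresses."""
    try:
        results = resolver(hostname, None)
    except socket.gaierror as exc:
        return f"DNS failure for {hostname}: {exc}"
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    addresses = sorted({entry[4][0] for entry in results})
    return f"{hostname} resolves to {', '.join(addresses)}"
```

If the name fails here but pinging a known IP address succeeds, the complaint is most likely a DNS problem rather than a routing one.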
Wireless-specific
Which band: 2.4 or 5 GHz? (RF)
2.4 GHz has longer range but more congestion and interference. 5 GHz is faster but shorter range. Symptoms differ between bands on the same SSID.
What channel and signal strength (RSSI)?
Poor RSSI (below −75 dBm) causes retransmits and slow speeds. Channel overlap with neighbours degrades throughput even at good signal levels.
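Both checks reduce to simple comparisons. The sketch below uses the −75 dBm threshold from the text and assumes 2.4 GHz channels 1 to 13, which are spaced 5 MHz apart while each signal is roughly 20 MHz wide, so channels fewer than five apart overlap. The function names are illustrative.

```python
def rssi_quality(rssi_dbm, poor_below=-75):
    """Classify signal strength; RSSI is negative, closer to 0 is stronger."""
    return "poor" if rssi_dbm < poor_below else "acceptable"

def channels_overlap_2ghz(a, b):
    """True if two 2.4 GHz channels (1-13) share spectrum, including co-channel."""
    return abs(a - b) < 5

print(rssi_quality(-82))             # weak signal, expect retransmits
print(channels_overlap_2ghz(1, 6))   # the common 1/6/11 plan avoids overlap
print(channels_overlap_2ghz(1, 3))   # neighbour on channel 3 interferes with 1
```

Channel 14 and wider 40 MHz channels follow different rules and are deliberately left out of this sketch.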
Which AP is the device associated with?
A sticky client clinging to a distant AP instead of roaming to a nearby one explains poor performance. Check the AP MAC vs nearest AP location.
Security & policy
Could a firewall or ACL be blocking it? (firewall)
Firewalls usually drop denied traffic silently, so the client sees a timeout rather than an error. Check rule logs, hit counters, and test from a known-clean host to isolate policy-driven drops from faults.
Is 802.1X / NAC in use?
An authentication failure can drop a client into a restricted VLAN. Check RADIUS logs and supplicant status before assuming a switch or cable fault.
Is a VPN active on the device? (tunnel)
VPN split tunnelling, routing table injection, or DNS override can cause connectivity for specific destinations to fail while everything else works.
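To see why a VPN breaks only some destinations, it helps to work out which route wins for a given address. The sketch below applies longest-prefix matching to a hypothetical routing table; the prefixes and interface names are invented, and real operating systems also weigh route metrics, which are ignored here.

```python
import ipaddress

def select_route(destination, routes):
    """Return the interface of the most specific route covering `destination`."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, interface in routes:
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, interface))
    if not matches:
        return None
    return max(matches)[1]  # longest prefix wins

# Hypothetical table after the VPN client has injected 10.0.0.0/8.
routes = [
    ("0.0.0.0/0", "eth0"),      # default route via the local gateway
    ("10.0.0.0/8", "vpn0"),     # injected by the VPN client
    ("10.1.50.0/24", "eth0"),   # local office subnet
]

for dest in ("10.1.50.20", "10.9.9.9", "8.8.8.8"):
    print(dest, "->", select_route(dest, routes))
```

Here a local printer on 10.9.9.9 would be sent into the tunnel, which matches the "one destination fails while everything else works" symptom.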
Performance & baseline
What is the expected baseline? (baseline)
You cannot define "slow" without knowing what normal looks like. Ask what speed, latency, or reliability users expect, then measure how far the current state deviates from it.
Is the link saturated?
A congested uplink causes latency and packet loss for everyone on it. Check interface utilisation graphs on the switch or router before assuming failure.
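Where only raw counters are available rather than graphs, utilisation can be estimated from two readings of an interface's octet counter. This sketch assumes the counter counts bytes in one direction, that you know the link speed in bits per second, and that the counter wrapped at most once between readings.

```python
def utilisation_percent(octets_start, octets_end, interval_s, link_bps, counter_bits=64):
    """Percent utilisation of a link from two octet-counter samples."""
    # Modulo handles a single counter wrap between the two samples.
    delta_octets = (octets_end - octets_start) % (2 ** counter_bits)
    return delta_octets * 8 * 100 / (interval_s * link_bps)

# Illustrative figures: 6 GB transferred in 60 s on a 1 Gbit/s uplink.
print(utilisation_percent(1_000_000, 6_001_000_000, 60, 1_000_000_000))
```

Short polling intervals matter with 32-bit counters, which can wrap more than once on a fast link and make the estimate meaningless.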
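What do ping/traceroute show? (path)
A latency spike or loss that starts at a specific hop and persists through every later hop pinpoints the failing segment; loss that shows at a single intermediate hop only is often just ICMP rate limiting on that router. Consistent loss and bursty loss point to different root causes: hardware versus congestion.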
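Telling steady loss from bursty loss is easier with numbers than by eyeballing ping output. The sketch below summarises a sequence of ping outcomes (True where a reply arrived); how you collect that sequence is left open, and the interpretation of the figures is a rule of thumb rather than a diagnosis.

```python
from itertools import groupby

def summarise_loss(replies):
    """Summarise loss in a sequence of booleans (True = reply received)."""
    replies = list(replies)
    total = len(replies)
    lost = sum(1 for ok in replies if not ok)
    # Length of each run of consecutive lost pings.
    runs = [len(list(group)) for ok, group in groupby(replies) if not ok]
    return {
        "loss_pct": 100 * lost / total if total else 0.0,
        "loss_runs": len(runs),
        "longest_run": max(runs, default=0),
    }

# Illustrative sample: one five-ping outage inside twenty pings.
sample = [True] * 10 + [False] * 5 + [True] * 5
print(summarise_loss(sample))
```

Many short runs spread evenly across the sample look like consistent loss, while a few long runs look bursty.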
Has this happened before? (history)
A recurring issue points to an underlying unfixed root cause. Check incident history and CMDB before applying the same band-aid a third time.