Seten: How to find the cause and solution of switch failure

Date: November 30, 2023
Switches are among the most widely used network devices, and faults are inevitable over long-term operation. Resolving them quickly requires knowing the common types of switch fault and how to analyze and handle each one. Several common switch faults and their solutions are summarized below.

1. Power failure

An unstable external power supply, aging power lines, or a lightning strike can damage the power supply or other internal components, leaving the switch unable to work properly.

If the POWER indicator on the switch panel is green, power is normal. If the indicator is off, the switch is not receiving power properly.

To address such faults, start with the external power supply: run a dedicated power line to the switch and add a voltage regulator to guard against momentary high or low voltage. If conditions permit, add a UPS (uninterruptible power supply) to keep the switch powered. Install professional lightning protection in the computer room to prevent lightning damage to the switch.

2. Port failure

Port failure is the most common hardware failure. Whether the port is a fiber optic port or a twisted-pair RJ-45 port, take care when plugging and unplugging connectors. A dirty fiber optic plug can contaminate the fiber port and prevent normal communication.

Many people like to plug and unplug connectors while the equipment is powered on. This is possible in theory, but it increases the incidence of port failures. Ports can also be physically damaged during transportation. An oversized crystal head (RJ-45 plug) can easily damage the port when it is inserted into the switch. If a section of the twisted-pair cable connected to a port runs outdoors and is struck by lightning, the connected switch port will be damaged as well, and the losses can be even harder to predict.

In general, a port failure involves damage to one or a few ports. After ruling out a fault in the computer connected to the port, move the connection to a different port to determine whether the original port is damaged. For such faults, power off the switch and clean the port with an alcohol cotton ball. If the port is indeed damaged, the only option is to replace it.

3. Module malfunction

A switch is made up of many modules: the stacking module, the management module (also known as the control module), expansion modules, and so on. These modules rarely fail, but when they do, the economic losses can be large. Collisions while the switch is being moved and an unstable power supply can both cause such faults.

The three modules mentioned above are relatively easy to tell apart, and some faults can be identified from the indicator lights on the module. The stacking module has a flat trapezoidal port (somewhat similar to a USB interface). The management module has a CONSOLE port, which is used to connect to the network management computer. An expansion module connected via fiber has a pair of fiber optic interfaces.

When troubleshooting such faults, first make sure the switch and its modules are receiving power, then check that each module is seated in the correct position, and finally check the cables connecting the modules. When connecting to the management module, also check that the connection uses the specified rate, parity setting, and data flow control. When connecting an expansion module, check that both ends use the same duplex mode, either full or half. If a module is confirmed faulty, contact the supplier immediately for a replacement.
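The management-module check above (rate, parity, flow control) amounts to comparing the terminal's serial settings with the ones the switch manual specifies. A minimal sketch follows. The `SerialSettings` type, the `mismatched_settings` helper, and the example values are all hypothetical; the real required values come from the switch's manual.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class SerialSettings:
    rate: int           # connection rate in bits per second
    parity: str         # "none", "even", or "odd"
    flow_control: str   # "none", "hardware", or "software"


def mismatched_settings(required, actual):
    """Return the names of settings where the terminal differs from the requirement."""
    return [
        f.name
        for f in fields(required)
        if getattr(required, f.name) != getattr(actual, f.name)
    ]


# Example values are assumptions for illustration, not taken from any manual.
required = SerialSettings(rate=9600, parity="none", flow_control="none")
terminal = SerialSettings(rate=115200, parity="none", flow_control="hardware")
print(mismatched_settings(required, terminal))
```

An empty list means the terminal matches the specified settings, so the fault lies elsewhere.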

4. Backplane malfunction

All modules of the switch plug into the backplane. A short circuit caused by moisture on the circuit board, or component damage from high temperature or lightning, can make the board fail. Poor heat dissipation or a high ambient temperature raises the internal temperature, which can burn out components.

If the external power supply is normal but none of the switch's internal modules work, the backplane may be damaged and need replacing.

5. Cable failure

Strictly speaking, cable faults are not faults of the switch itself, but in practice they can cause the switch system or its ports to malfunction. Examples include loose cable connectors, an incorrect or non-standard wire order when the cable is made, using a straight-through cable where a crossover cable is needed, swapping the two fibers in an optical cable, and incorrect connections that create network loops.
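The wire-order and straight-through versus crossover checks can be illustrated by comparing the color sequence at each end of a twisted-pair cable with the T568A and T568B pin assignments. This is a minimal sketch: the function names are hypothetical, and it covers only the common crossover layout with T568A on one end and T568B on the other.

```python
# Pin 1 to pin 8 color order for the two standard terminations.
T568A = ("white-green", "green", "white-orange", "blue",
         "white-blue", "orange", "white-brown", "brown")
T568B = ("white-orange", "orange", "white-green", "blue",
         "white-blue", "green", "white-brown", "brown")
STANDARDS = {"T568A": T568A, "T568B": T568B}


def identify(end):
    """Return the standard name matching this end's wire order, or None."""
    for name, order in STANDARDS.items():
        if tuple(end) == order:
            return name
    return None


def classify_cable(end_a, end_b):
    """Classify a cable from the wire order observed at each connector."""
    a, b = identify(end_a), identify(end_b)
    if a is None or b is None:
        return "non-standard wire order"
    if a == b:
        return "straight-through"
    return "crossover"


print(classify_cable(T568B, T568B))
print(classify_cable(T568A, T568B))
```

A cable reported as "non-standard wire order" should be re-terminated before any switch port is suspected.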

A poor computer room environment easily leads to hardware failures of all kinds. When building a computer room, first take care of lightning protection and grounding, power supply, indoor temperature and humidity, electromagnetic interference protection, and anti-static measures, so that network equipment has a good operating environment.

6. System error

A switch system is a combination of hardware and software. Inside the switch is a rewritable read-only memory that stores the software the switch needs. Like Windows or Linux, this software may contain vulnerabilities that went unnoticed when it was designed, and these can cause the switch to run at full load, drop packets, or produce packet errors. The switch system provides methods such as web and FTP for downloading and updating the system. Of course, errors can also occur during an upgrade.

For such problems, make a habit of checking the device manufacturer's website regularly, and apply new system versions or patches promptly.

7. Improper configuration

Because beginners are unfamiliar with switches, and because each switch is configured differently, configuration errors are common. Examples include incorrect VLAN partitioning that breaks network connectivity, ports that are shut down by mistake, and mismatched mode settings between the switch and a network card.
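Two of the errors above, wrong VLAN assignment and a port shut down by mistake, can be checked mechanically once the port settings are written out. The sketch below assumes plain access ports and no routing between VLANs. `PortConfig`, `connectivity_problems`, and the port table are hypothetical and not tied to any vendor's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortConfig:
    vlan: int       # access VLAN assigned to the port
    enabled: bool   # False means the port was shut down in the configuration


def connectivity_problems(ports, a, b):
    """List configuration reasons why ports a and b cannot exchange frames."""
    problems = []
    for name in (a, b):
        if not ports[name].enabled:
            problems.append(f"{name} is shut down")
    if ports[a].vlan != ports[b].vlan:
        problems.append(
            f"{a} is in VLAN {ports[a].vlan} but {b} is in VLAN {ports[b].vlan}"
        )
    return problems


# Hypothetical port table for illustration only.
ports = {
    "port1": PortConfig(vlan=10, enabled=True),
    "port2": PortConfig(vlan=20, enabled=True),
    "port3": PortConfig(vlan=10, enabled=False),
}
for problem in connectivity_problems(ports, "port1", "port2"):
    print(problem)
```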

This type of fault is difficult to detect and takes some experience to diagnose. If you cannot determine whether the configuration is at fault, restore the factory default configuration first and then reconfigure step by step. Read the user manual before configuring: each switch comes with a detailed installation manual and user manual, with detailed explanations for each type of module.

8. External factors

In a hacker attack, a host may send a large number of packets that do not comply with encapsulation rules to the port it is connected to. The switch processor becomes too busy to forward packets in time, which leads to buffer overflow and packet loss.

Another scenario is a broadcast storm, which consumes not only a large amount of network bandwidth but also a significant amount of CPU processing time. If broadcast packets occupy the network for a long time, normal point-to-point communication cannot proceed, and the network slows down or becomes paralyzed.

A failed network card or port can trigger a broadcast storm. Because a switch segments collision domains but not broadcast domains (unless VLANs are configured), network transmission efficiency drops significantly once broadcast packets account for 30% of total traffic.
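The 30% figure can be turned into a simple check on packet counters. This is a minimal sketch, assuming broadcast and total packet counts have already been read for the same interval. The function names are hypothetical, and the threshold is the one quoted above rather than a vendor default.

```python
BROADCAST_THRESHOLD = 0.30  # share of total traffic quoted in the text above


def broadcast_ratio(broadcast_packets, total_packets):
    """Return the fraction of packets that were broadcasts (0.0 if no traffic)."""
    if total_packets <= 0:
        return 0.0
    return broadcast_packets / total_packets


def storm_suspected(broadcast_packets, total_packets, threshold=BROADCAST_THRESHOLD):
    """True when the broadcast share reaches the threshold."""
    return broadcast_ratio(broadcast_packets, total_packets) >= threshold


# Hypothetical counter values for one sampling interval.
print(storm_suspected(broadcast_packets=3500, total_packets=10000))
print(storm_suspected(broadcast_packets=500, total_packets=10000))
```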

It is best to make a habit of keeping logs in daily work. When a fault occurs, promptly record the symptoms, analyze how the fault developed, plan the resolution, classify and summarize the fault, and build up experience.

A configuration change may, for various reasons, have no visible effect on the network at first, with the problem emerging only a few days later. With a log record, the problem can be traced back to a configuration change made a few days earlier.

We often overlook this and assume the problem lies elsewhere, identifying the real cause only after many detours. This is why it is necessary to keep logs and preserve the information.
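As an illustration of linking a fault to an earlier configuration change, the sketch below filters a change log for entries made in the days before the fault. The log format (a list of time and description pairs), the `recent_changes` name, the seven-day window, and the sample entries are all assumptions for illustration.

```python
from datetime import datetime, timedelta


def recent_changes(change_log, fault_time, window_days=7):
    """Return changes made within window_days before the fault, newest first."""
    window_start = fault_time - timedelta(days=window_days)
    hits = [
        (when, what)
        for when, what in change_log
        if window_start <= when <= fault_time
    ]
    return sorted(hits, key=lambda entry: entry[0], reverse=True)


# Hypothetical log entries; a real log would come from the team's own records.
change_log = [
    (datetime(2023, 11, 1, 9, 0), "upgraded system software"),
    (datetime(2023, 11, 26, 14, 30), "moved port 12 to VLAN 20"),
    (datetime(2023, 11, 28, 10, 15), "set port 7 to half duplex"),
]
fault_time = datetime(2023, 11, 30, 8, 0)
for when, what in recent_changes(change_log, fault_time):
    print(when.isoformat(), what)
```

The most recent changes are listed first because they are usually the first candidates to review.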

