IoT and Sensor Networks Course Review Notes
Introduction
1 Definition of the IoT
- Technical Understanding: The Internet of Things (IoT) refers to an intelligent network in which information about objects is collected through intelligent sensing devices, transmitted over networks, and processed at designated information centers, ultimately achieving automated information exchange and processing among objects, and between humans and objects.
- Application Understanding: The IoT integrates all objects in the world into one network, which then connects with the existing “Internet” to integrate human society with physical systems, achieving finer-grained and more dynamic management of production and life.
- Common Understanding: Combining RFID (Radio-Frequency Identification) and WSN (Wireless Sensor Network) to provide users with services such as monitoring, command dispatch, remote data collection and measurement, and remote diagnosis in production and daily life.
2 Characteristics of the IoT
- Comprehensive Perception: Using RFID, sensors, QR codes, etc., to obtain information about objects anytime and anywhere.
- Reliable Transmission: Through the integration of networks and the internet, the information of objects is transmitted to users accurately and in real time.
- Intelligent Processing: Utilizing computing, data mining, and artificial intelligence technologies, such as fuzzy recognition, to analyze and process massive data and information and intelligently control objects.
3 Conceptual Model of the IoT
Perception (sensing layer), Transmission (network layer), and Computing (application layer)
- Sensing Layer: Identifies objects, collects, and captures information through methods such as RFID and cameras, laying the foundation for the IoT’s comprehensive perception. Requires more comprehensive and sensitive perception abilities, low power consumption, and solutions for miniaturization and cost reduction.
- Network Layer: Connects the sensing layer with the application layer, achieving ubiquitous connectivity, which is the most mature aspect at present; consists of access networks, core networks, and service platforms. Requires the ability to expand and manage operations everywhere, business scalability, and a simplified structure to integrate layers.
- Application Layer: A collection of solutions for widespread intelligent applications; application areas include smart homes, electricity, transportation, etc. Requires deep integration of information technology and industries, social sharing and security of information, and cloud computing-based application support.
4 Main Characteristics of Sensor Data
- Massiveness: Assuming each sensor sends back data only once a minute, 1000 nodes would produce about 1.4GB of data per day (1000 nodes x 1440 reports per day, which implies roughly 1 KB per report).
- Diversity: Ecological monitoring systems (temperature, humidity, light); Multimedia sensor networks (audio, video); Fire navigation systems (structured communication data).
- Correlation: Data describing the same entity has temporal correlation (temperature changes over time at the same node); data describing different entities has spatial correlation (temperature and humidity readings are similar within the same area); data dimensions of an entity also have correlation (temperature and humidity readings at the same node and time are related).
- Semantics: Data is endowed with meaning by humans, facilitating use.
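The Massiveness estimate above can be checked with a few lines of arithmetic. This is only a sketch: the 1000-byte report size is an assumption chosen so that the result matches the 1.4GB figure, and `daily_volume_bytes` is a hypothetical helper name.

```python
def daily_volume_bytes(nodes: int, reports_per_day: int, bytes_per_report: int) -> int:
    """Total bytes produced per day by all nodes together."""
    return nodes * reports_per_day * bytes_per_report


REPORTS_PER_DAY = 24 * 60   # one report per minute
BYTES_PER_REPORT = 1000     # assumption: about 1 KB per report

total = daily_volume_bytes(1000, REPORTS_PER_DAY, BYTES_PER_REPORT)
print(f"{total / 1e9:.2f} GB per day")  # prints "1.44 GB per day"
```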
5 Wireless Sensing Methods
- Traditional sensing: various sensors
- Intelligent wireless sensing/sensing without sensors: WiFi, Bluetooth, ZigBee, UWB, RFID
- Crowdsensing: crowdsourcing, Baidu Maps
Wireless Local Area Networks
1 Structure of Wireless LANs
- Station/Wireless Access Point (AP): AP is the core device of a wireless LAN, providing both wired and wireless interfaces to connect workstations and network servers.
- Wireless Medium
- Distribution System (DS): The DS connects different BSSs (Basic Service Sets), enabling workstations to move between BSSs and achieve roaming.
- Terminal
In the North American standard IEEE 802.11 b/g, there are a total of 11 channels, among which channels 1, 6, 11 are non-overlapping transmission channels.
2 Classical Issues in Wireless LANs
Characteristics of wireless information transmission:
- Electromagnetic waves emitted by a wireless user disperse in all directions
- All wireless users within a certain range share the transmission channel
- Wireless communication has a coverage area.
Classic Problems of Wireless Networks
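- Hidden Terminal Problem: Three entities A, B, and C. A and C cannot hear each other, so both sense the channel as idle and send to B at the same time, resulting in a collision at B. RTS and CTS frames (which include the source address, destination address, and communication time) solve this.
- Exposed Terminal Problem: Four entities A, B, C, and D. A sending to B does not affect C sending to D, but C hears A's transmission, senses the channel as busy, and defers unnecessarily.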
The RTS/CTS mechanism can solve the hidden terminal problem but cannot solve the exposed terminal problem:
- Before data transmission, a consensus on data transmission is reached with the receiving node through the RTS / CTS handshake method, which also notifies the neighbors of the sending and receiving nodes about the upcoming transmission.
- Neighbor nodes, upon receiving RTS / CTS, suppress their own transmissions for a period of time to avoid causing collisions with the upcoming data transmission.
- This way of solving problems comes at the cost of additional control messages.
RTS/CTS cannot solve the exposed terminal problem because RTS frames do not have high priority, and the presence of data packets can conflict with RTS/CTS frames. Below is one scenario.
3 CSMA/CD Protocol
CSMA/CD protocol, Carrier Sense Multiple Access with Collision Detection, can be summarized as: listen before transmitting, keep listening while transmitting, stop on collision, and delay retransmission. CSMA/CD protocol is not suitable for wireless LANs.
Reasons CSMA/CD is not suitable for wireless LANs:
- Wireless LAN devices cannot achieve the CSMA/CD protocol requirement of continuously monitoring the channel while transmitting data (half-duplex).
- Even if we could implement collision detection functionality, and the channel appears idle while sending data, a collision can still occur at the receiving end (hidden terminal).
- A collision at the local node does not necessarily mean a collision at the receiving end (exposed terminal).
4 CSMA/CA
CSMA/CA stands for Carrier Sense Multiple Access with Collision Avoidance. A station senses the carrier first and, if the medium is busy, waits until the current transmission is entirely finished. Listening methods include physical carrier sensing (judging by signal strength) and virtual carrier sensing (frames announce the duration for which the channel will be occupied).
CSMA/CA flowchart needs to be understood.
There are two methods for collision avoidance:
4.1 Priority Acknowledgment Protocol
Inter Frame Space (IFS): All stations must wait a very short time (continue listening) after completing a transmission before sending the next frame.
Priority: The length of the Inter Frame Space depends on the type of frame the station wants to send. Frames with high priority require less waiting time, thus obtaining the right to transmit first.
Type | Time | Included Frame Types | Note |
---|---|---|---|
SIFS (Short IFS) | Shortest | ACK frames, CTS frames, data frames resulting from MAC frame fragmentation, and all frames responding to AP polling | Shortest wait, highest priority |
PIFS (Point Coordination Function IFS) | SIFS + 1 slot | Frames sent under AP coordination | Medium priority |
DIFS (Distributed Coordination Function IFS) | SIFS + 2 slots | Data frames, management frames, and RTS frames in DCF mode | Longest wait, distributed coordination |
4.2 Random Backoff Algorithm
Even with the same priority, there might be contention. When the channel changes from busy to idle, any station wishing to transmit data frames must not only wait for a DIFS interval but also enter a contention window and calculate a random backoff time before attempting to access the channel again.
The size of the contention window is a trade-off. When the network load is heavy, a small contention window makes the random values chosen by the nodes too close together, which leads to too many collisions; when the network load is light, a large contention window makes nodes wait longer than necessary, resulting in unnecessary delay. The system should therefore adapt to the current number of nodes wishing to send. Exponential backoff algorithm: the contention window is initialized to its minimum value and is enlarged (doubled) after each collision until it reaches its maximum value.
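The exponential backoff rule can be sketched as follows. The window limits `CW_MIN = 15` and `CW_MAX = 1023` slots are assumed illustrative values, not taken from these notes, and the function names are hypothetical.

```python
import random

CW_MIN = 15    # assumed minimum contention window, in slots
CW_MAX = 1023  # assumed maximum contention window, in slots


def contention_window(retries: int, cw_min: int = CW_MIN, cw_max: int = CW_MAX) -> int:
    """Window size after `retries` collisions: doubles each time, capped at cw_max."""
    return min((cw_min + 1) * 2 ** retries - 1, cw_max)


def backoff_slots(retries: int) -> int:
    """Pick a uniformly random backoff counter inside the current window."""
    return random.randint(0, contention_window(retries))


for r in range(8):
    print(r, contention_window(r))  # 15, 31, 63, ... then capped at 1023
```

A station would wait DIFS, count down `backoff_slots(retries)` idle slots, and transmit when the counter reaches zero; after a successful transmission the retry count goes back to 0, which resets the window to its minimum.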
5 MAC Layer Functions
The MAC layer must implement the DCF (Distributed Coordination Function, where each node determines its access time on its own) and may optionally implement the PCF (Point Coordination Function, coordinated by the AP, for example by letting stations take turns). DCF and PCF can coexist within the same BSS, providing contention-based and contention-free access side by side. (A BSS, Basic Service Set, includes an AP and several stations; multiple BSSs can be connected via the distribution system to form an extended service set.)
Main functions of the MAC layer:
- Media access control
- Joining network connections
- Data verification and confidentiality
Conversion between decibels and power
$$ P_{dBm}=10\log_{10}\frac{P}{1\,\text{mW}} $$
According to national standards, the maximum power of routers should not exceed 100mW, which is 20 dBm (decibels relative to 1 mW).
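The conversion can be checked numerically. A minimal sketch, assuming power is expressed in milliwatts so that the result is in dBm; the function names are illustrative only.

```python
import math


def mw_to_dbm(p_mw: float) -> float:
    """Convert power in milliwatts to dBm (decibels relative to 1 mW)."""
    return 10 * math.log10(p_mw)


def dbm_to_mw(p_dbm: float) -> float:
    """Convert dBm back to milliwatts."""
    return 10 ** (p_dbm / 10)


print(mw_to_dbm(100))  # the 100 mW router limit, i.e. 20 dBm
print(dbm_to_mw(0))    # 0 dBm corresponds to 1 mW
```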
6 Zigzag
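When two packets collide, both are retransmitted and collide again. Because each sender waits a random time, the offset Δ between the two packets differs from one collision to the next, so from two such collisions the AP can restore both data packets, which is equivalent to the packets not having collided. This is based on two characteristics: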
- Sending with a collision will surely lead to retransmission
- The position of each collision is different
To avoid errors in the analysis process leading to a domino effect, one can analyze from the back and adopt the second data packet if the two packets are the same; if they are different, it signifies an error, and the AP chooses the one with higher PHY confidence level.
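The chunk-by-chunk recovery can be illustrated with a toy model. This is a sketch under strong assumptions: symbols are plain integers that add up when they overlap, there is no noise, and both packets have the same length; real Zigzag works on physical-layer signal samples. The names `collide` and `zigzag_decode` are hypothetical.

```python
def collide(a, b, offset):
    """Superpose packet b onto packet a, with b starting `offset` symbols later."""
    y = [0] * (len(a) + offset)
    for i, (sa, sb) in enumerate(zip(a, b)):
        y[i] += sa
        y[i + offset] += sb
    return y


def zigzag_decode(y1, d1, y2, d2, n):
    """Recover two n-symbol packets from two collisions with offsets d1 != d2."""
    a, b = [None] * n, [None] * n
    progress = True
    while progress:
        progress = False
        for y, d in ((y1, d1), (y2, d2)):
            for t in range(n + d):
                ia, ib = t, t - d                  # indices into a and b at time t
                a_in, b_in = ia < n, 0 <= ib < n
                # Symbol of a is recoverable once the overlapping b symbol is known.
                if a_in and a[ia] is None and (not b_in or b[ib] is not None):
                    a[ia] = y[t] - (b[ib] if b_in else 0)
                    progress = True
                # Symmetric step for b.
                if b_in and b[ib] is None and (not a_in or a[ia] is not None):
                    b[ib] = y[t] - (a[ia] if a_in else 0)
                    progress = True
    return a, b


pkt_a, pkt_b = [1, 2, 3, 4, 5], [9, 8, 7, 6, 5]
rec_a, rec_b = zigzag_decode(collide(pkt_a, pkt_b, 2), 2, collide(pkt_a, pkt_b, 1), 1, 5)
print(rec_a, rec_b)
```

The decoder starts from the interference-free symbols at both ends of each collision and alternates between the two collisions, subtracting what it already knows. If both offsets were equal, no new symbols would ever become recoverable and the unknown positions would stay `None`.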
Zigzag advantages
- Can use 802.11 standard decoders without modifying its protocol
- Zigzag handles situations with multiple conflicting packets and introduces no extra overhead in the absence of conflicts
- Zigzag requires changes only to the AP, not to clients
Wireless Sensor Networks
1 WSN (Wireless Sensor Network)
Wireless sensor network systems typically include sensor nodes, aggregation (sink) nodes, and management nodes. A WSN is a large-scale, self-organizing, dynamic, reliable, application-specific network.
1.1 Structural composition
A sensor node consists of the modules listed below. Operating systems include TinyOS (flexible but hard to get started with) and TI's platform based on the Zigbee protocol stack (easier to get started with but less flexible).
- Sensor module
- Processor module
- Wireless communication module
- Energy supply module
The difference between sensor networks and wireless networks: The primary goal of sensor networks is energy saving; devices in wireless networks can move while nodes in sensor networks are mostly stationary (but prone to faults).
1.2 Characteristics of nodes
Limitations:
- Limited energy supply (the communication module consumes the most energy; its four states are transmitting, receiving, idle listening, and sleeping)
- Limited computing and storage capabilities
- Limited communication capabilities
Features:
- Adjacent nodes have similar data (can be used for optimization)
- Sensor nodes do not have global IDs
1.3 Antenna length
When using radio communication, one basic condition needs to be met: the antenna size should be greater than one-tenth of the wavelength for the signal to be transmitted effectively. In practice, low-frequency signals are therefore modulated onto high-frequency carriers, whose shorter wavelength allows a smaller antenna.
Three types of signal propagation from antennas: ground waves, sky waves (reflected by the ionosphere), and direct line-of-sight waves (above 30MHz).
Antenna communication distance:
$$ d= 3.57\sqrt{Kh} $$
where $K=\frac{4}{3}$ is the refraction constant, $h$ is the antenna height in meters, and $d$ is the distance in kilometers.
The maximum propagation distance between two antennas:
$$ d= 3.57(\sqrt{Kh_1}+\sqrt{Kh_2}) $$
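Both distance formulas can be evaluated directly. A minimal sketch, assuming antenna heights in meters and distances in kilometers; the function names are illustrative.

```python
import math

K = 4 / 3  # refraction constant from the formulas above


def line_of_sight_km(h_m: float, k: float = K) -> float:
    """Communication distance (km) of one antenna at height h_m (m)."""
    return 3.57 * math.sqrt(k * h_m)


def max_distance_km(h1_m: float, h2_m: float, k: float = K) -> float:
    """Maximum propagation distance (km) between antennas at heights h1_m and h2_m (m)."""
    return 3.57 * (math.sqrt(k * h1_m) + math.sqrt(k * h2_m))


print(round(line_of_sight_km(100), 1))     # single antenna at 100 m
print(round(max_distance_km(100, 10), 1))  # 100 m antenna talking to a 10 m antenna
```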
2 Architecture of Sensor Networks
Implosion and Overlap Phenomena
Implosion: Nodes forward data packets to neighbor nodes regardless of whether the neighbors have received the same packets before. Information implosion is thus the phenomenon where nodes in the network receive multiple copies of the same data.
Overlap: Sensing nodes have overlapping perception areas, leading to data redundancy. Because sensor nodes are densely deployed, several nodes in the same local area may respond to the same event, and the information they perceive is similar in nature and identical in value. The copies of data received by the neighboring nodes of these sensors are therefore highly correlated.
2.1 Classification of Sensor Networks
Proactive Networks-Continuously Operating Model
- Nodes regularly activate sensors and transmitters, sense the environment, and send out data of interest.
- Suitable for applications requiring regular data monitoring.
Reactive Networks-Query-Response Model
- Nodes immediately respond to query commands sent by users.
- Nodes immediately respond to changes in certain network attribute values.
2.2 Classification of Sensor Network Architecture
September 22
2.2.1 Hierarchical Architecture
Disadvantage: Nodes near the base station consume energy quickly, creating energy holes.
2.2.2 Clustered Architecture
LEACH Protocol: Low Energy Adaptive Clustering Hierarchy, clusters form spontaneously, and cluster heads are autonomously elected.
PEGASIS: In LEACH, each node can only communicate with its cluster head, which leads to high costs if some intra-cluster nodes are far from the head. PEGASIS optimizes this by organizing nodes into chains in which each node communicates with its nearest neighbor. Slide 29/37
Advantages: Any message needs at most two hops, and distributed election of cluster heads balances energy consumption.
2.2.3 Direct Transmission
All nodes send data directly to the base station; energy consumption is high, and the base station has to deal with collisions.
2.3 Data Distribution and Data Collection
The above architectures are for data collection, aiming to minimize energy consumption and data transmission delay, using energy*delay as a metric of algorithm performance.
Data distribution is the process of routing query packets and data packets in sensor networks. The most direct method is flooding, where every non-target node broadcasts received packets with non-zero TTL. This protocol is simple, requiring no complex topology maintenance or route discovery algorithms, but it can lead to implosion, data overlap, and blind resource consumption issues.
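Flooding and the implosion it causes can be sketched on a small graph. This is a simplified model under stated assumptions: the topology is a hypothetical adjacency dictionary, a node forwards to every neighbor except the one it received the packet from, and the TTL is decremented once per hop.

```python
from collections import Counter, deque


def flood(adj, source, target, ttl):
    """Flood one packet from source and count the copies each node receives."""
    received = Counter()
    queue = deque((nb, source, ttl) for nb in adj[source])
    while queue:
        node, sender, t = queue.popleft()
        received[node] += 1
        if node == target or t <= 1:
            continue  # the target does not forward; an exhausted TTL stops the packet
        for nb in adj[node]:
            if nb != sender:
                queue.append((nb, node, t - 1))
    return received


# Hypothetical square topology: A reaches D through B and through C.
adj = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A", "D"], "D": ["B", "C"]}
copies = flood(adj, "A", "D", ttl=3)
print(dict(copies))  # D receives the same packet twice: implosion
```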
3 Positioning Technologies
WSN Positioning Classification
3.1 Range-Based Positioning
- Signal Strength RSS
- Based on Timing TOA/TDOA/RTOF
- Based on Angle AOA
Time Of Arrival (TOA) requires synchronization between transmitter and receiver; Time Difference Of Arrival (TDOA) introduces an ultrasonic module, eliminating the need for synchronization; Round-trip Time Of Flight (RTOF) is a compromise between the two, requiring neither synchronization nor specific hardware, but it is less accurate than TDOA.
Range-based methods measure physical quantities: high cost, high accuracy.
Examination format: Given scenarios, data, and problems, write out plans and calculate distances and coordinates.
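For the "calculate distances and coordinates" type of problem, the following sketch shows one common approach: a TDOA distance from the arrival-time gap between a radio signal and an ultrasonic pulse, followed by trilateration from three anchors. The propagation speeds, the anchor layout, and the function names are assumptions for illustration.

```python
import math

V_RF = 3.0e8  # assumed radio propagation speed, m/s
V_US = 340.0  # assumed speed of sound (ultrasound), m/s


def tdoa_distance(dt, v_fast=V_RF, v_slow=V_US):
    """Distance from the arrival-time gap dt (s) between the fast and slow signal."""
    return dt * v_fast * v_slow / (v_fast - v_slow)


def trilaterate(anchors, dists):
    """Solve for (x, y) from three anchor positions and measured distances.

    Subtracting the third circle equation from the first two gives a
    2x2 linear system, solved here with Cramer's rule.
    """
    (x1, y1), (x2, y2), (x3, y3) = anchors
    d1, d2, d3 = dists
    a11, a12 = 2 * (x1 - x3), 2 * (y1 - y3)
    a21, a22 = 2 * (x2 - x3), 2 * (y2 - y3)
    b1 = x1**2 - x3**2 + y1**2 - y3**2 - d1**2 + d3**2
    b2 = x2**2 - x3**2 + y2**2 - y3**2 - d2**2 + d3**2
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("anchors are collinear")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det


anchors = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
dists = [5.0, math.sqrt(65), math.sqrt(45)]  # distances of the point (3, 4) to each anchor
print(trilaterate(anchors, dists))
```

With noisy measurements the three circles do not meet in one point, so a least-squares fit over more than three anchors would be the usual refinement.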
3.2 Non-Range-Based Positioning Technologies
Also known as distance-independent, it does not require the measurement of physical quantities.
Anchor points: locations already known.
Hop distances: the average distance per hop.
3.2.1 Centroid Algorithm
Beacon nodes periodically broadcast beacon packets to nearby nodes, containing the beacon node’s ID and location information. When an unknown node receives beacon packets from different beacon nodes and the number exceeds a certain threshold k or after a certain time, it determines its own location to be the centroid of the polygon formed by these beacon nodes.
$$ (X,Y)=\left(\frac{1}{n}\sum_{i=1}^{n}X_i,\ \frac{1}{n}\sum_{i=1}^{n}Y_i\right) $$
where $n$ is the number of beacon nodes whose packets were received.
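A minimal sketch of the centroid estimate, assuming the unknown node has already collected the coordinates of the beacon nodes it heard; the function name and the sample coordinates are made up for illustration.

```python
def centroid(beacons):
    """Estimate a node's position as the centroid of the beacon positions it heard."""
    if not beacons:
        raise ValueError("no beacon packets received")
    n = len(beacons)
    x = sum(bx for bx, _ in beacons) / n
    y = sum(by for _, by in beacons) / n
    return x, y


heard = [(0.0, 0.0), (4.0, 0.0), (4.0, 4.0), (0.0, 4.0)]
print(centroid(heard))  # (2.0, 2.0)
```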
3.2.2 DV-HOP Algorithm
October 11
Nodes with unknown locations depend on nodes with known locations to calculate their own positions.
The algorithm requires determining the minimum number of hops and the average distance per hop.
Why minimum number of hops? Because it reduces cumulative error and the distance is closest to a straight line.
How to determine the average distance? Based on the estimation of nodes with known locations. Slide 19/73 Three methods: Using only the nearest node (first received, unreasonable), average, weighted.
How to determine the minimum number of hops? Slide 16/73
Suitable for networks with many anchors, uniformly distributed.
Examining the calculation of average hop distance
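The average hop distance calculation can be sketched as follows, using the commonly cited DV-Hop formulation in which an anchor divides the sum of its straight-line distances to the other anchors by the sum of the corresponding minimum hop counts. The anchor coordinates, hop counts, and function name are hypothetical.

```python
import math


def average_hop_distance(anchor, anchors, hops):
    """Average distance per hop as estimated by one anchor.

    anchors: name -> (x, y); hops: frozenset({a, b}) -> minimum hop count.
    """
    xi, yi = anchors[anchor]
    total_dist, total_hops = 0.0, 0
    for other, (xj, yj) in anchors.items():
        if other == anchor:
            continue
        total_dist += math.hypot(xi - xj, yi - yj)
        total_hops += hops[frozenset((anchor, other))]
    return total_dist / total_hops


anchors = {"A": (0.0, 0.0), "B": (30.0, 0.0), "C": (0.0, 40.0)}
hops = {frozenset("AB"): 3, frozenset("AC"): 4, frozenset("BC"): 5}

hop_size = average_hop_distance("A", anchors, hops)  # (30 + 40) / (3 + 4)
print(hop_size)      # 10.0
print(hop_size * 2)  # estimated distance of a node two hops away from A
```

An unknown node multiplies the hop size it adopts (nearest anchor, average, or weighted) by its minimum hop count to each anchor, then locates itself from the resulting distance estimates.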
3.2.3 APIT
October 13
Nodes communicate with neighbors and simulate the motion process, then approximately determine whether they are inside or outside a triangle based on the PIT criterion, slides 8-9, repeating this process multiple times to determine the overlapping area of multiple triangles, taking the centroid as the location, slide 12.
Disadvantages: there can be misjudgments; unable to locate when the number of nodes is too small (≤3); requires a certain coverage rate and distribution of nodes; based on signal ranging, it is only applicable in open fields; distance and signal strength are not perfectly correlated.
3.3 Other Technologies Related to Localization
3.3.1 Sequential Localization
Nodes sort the sequence of signals received from anchors, and determine the common area through multiple perpendicular bisectors.
Another method is to determine and sort the sequence of neighbors, calculate the similarity (feature distance), and then correct the feature distance between the two nodes. If the similarity between two nodes is high, then they are close to each other (logical distance, feature distance).
In this method, each node has a different number of neighbors. How to calculate similarity (under conditions of different dimensions)? The sequence of node pairs is used as the measurement criterion to calculate similarity, which can be explicit, implicit, or possible.
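$$ SD=F_e +F_i + \frac{F_p}{2} $$
$$ RSD=SD \cdot \frac{\sqrt{K}}{K(K-1)/2},\quad K=|S_i \cup S_j| $$
where $F_e$, $F_i$, and $F_p$ are the numbers of node pairs counted as explicit, implicit, and possible, and $S_i$, $S_j$ are the neighbor sequences of the two nodes.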
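The two formulas can be evaluated as below. This sketch takes the three pair counts as given inputs and does not derive them from the neighbor sequences; the sample counts and the function names are invented for illustration.

```python
import math


def signature_distance(f_e: int, f_i: int, f_p: int) -> float:
    """SD = F_e + F_i + F_p / 2."""
    return f_e + f_i + f_p / 2


def regulated_signature_distance(sd: float, k: int) -> float:
    """RSD = SD * sqrt(K) / (K * (K - 1) / 2), with K = |S_i union S_j|."""
    if k < 2:
        raise ValueError("K must be at least 2 so that node pairs exist")
    return sd * math.sqrt(k) / (k * (k - 1) / 2)


sd = signature_distance(f_e=2, f_i=1, f_p=2)  # 2 + 1 + 2/2 = 4.0
rsd = regulated_signature_distance(sd, k=5)   # 4 * sqrt(5) / 10
print(sd, round(rsd, 3))
```

Dividing by K(K-1)/2, the number of node pairs, is what makes the value comparable between node pairs with different numbers of neighbors.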
Example of calculating feature distance
Advantages of RSD method:
- Increased accuracy
- Can achieve accuracy per hop
- Efficient: no flooding, two nodes exchanging sequences
- Low computational complexity
- Can be centralized or distributed
The disadvantage of RSD still lies in the fact that signal strength and distance are not perfectly correlated.
4 Time Synchronization Mechanisms in Sensor Networks
October 20, 2023
The role of time synchronization: Localization, data fusion, sleep wake-up (energy-saving! environmentally friendly!)
Factors affecting the transmission delay in time synchronization (slide 5/117): send, access (the most uncertain, slide 23/117), transmission, propagation, reception, and receive processing.
- Send time: The time taken by the sender to assemble and deliver the message to the MAC layer.
- Access time: The time from when the sender’s MAC layer gets the message until it acquires the right to send on the wireless channel. This is the most uncertain factor, depending on network load.
- Transmission time: The time taken by the sender to transmit the message.
- Propagation time: The time taken for the message to travel from the sender to the receiver.
- Reception time: The time taken by the receiver to receive the message.
- Receiving time: The time taken by the receiver to process the received message.
4.1 NTP
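Synchronization here requires only the calculation of the time difference between client and server. With $T_1$ the time the client sends the request, $T_2$ the time the server receives it, $T_3$ the time the server sends the reply, and $T_4$ the time the client receives the reply, the offset is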
$$ offset = \frac{(T_2-T_1)-(T_4-T_3)}{2} $$
As you can see from this formula, the time difference is independent of the server or client’s processing time.
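A small numeric check of the offset formula, assuming the usual meaning of the four timestamps (T1 client send, T2 server receive, T3 server send, T4 client receive). The round-trip delay function is the standard NTP companion formula, added here for illustration; the sample timestamps are invented.

```python
def ntp_offset(t1, t2, t3, t4):
    """Clock offset of the server relative to the client."""
    return ((t2 - t1) - (t4 - t3)) / 2


def ntp_round_trip_delay(t1, t2, t3, t4):
    """Round-trip network delay, excluding the server's processing time."""
    return (t4 - t1) - (t3 - t2)


# Invented scenario: server clock is 5 units ahead, one-way delay is 2 units,
# and the server takes 1 unit to process the request.
t1 = 100          # client clock
t2 = 100 + 2 + 5  # server clock on arrival
t3 = t2 + 1       # server clock on reply
t4 = 100 + 2 + 1 + 2  # client clock on arrival of the reply

print(ntp_offset(t1, t2, t3, t4))            # 5.0
print(ntp_round_trip_delay(t1, t2, t3, t4))  # 4
```

Changing the server's processing time shifts T3 and T4 by the same amount, so the computed offset stays the same.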
Reasons why the protocol used in computer networks is not suitable for sensor networks:
- The probability of sensor networks’ links being disrupted by the environment is high.
- The network structure (topology) of sensor networks is unstable.
- NTP servers cannot be realized through the network itself.
- NTP information exchange is frequent, resulting in high energy consumption.
4.2 RBS Class
Reference Broadcast Synchronization, multiple nodes receive the same synchronization signal, then synchronize among several nodes that received the synchronization signal (multiple times, using the least squares method to reduce error). This algorithm eliminates the time uncertainty of the synchronization signal sender.
Principle: The reference message does not need to carry the local time of the sending node. A node broadcasts time synchronization messages; the receivers record their local arrival times, exchange them, and average the differences over several messages, which minimizes the impact of the receivers not recording the arrival at exactly the same moment.
Advantages: Time synchronization is separated from the MAC layer protocol, free from limitations, good interoperability.
Disadvantages: High protocol overhead.
The sender does not need to write a timestamp into the message.
To reduce the error caused by propagation and reception, statistical techniques can be used: broadcast multiple time synchronization messages and average the differences between their arrival times.
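The averaging step can be sketched as follows, assuming two receivers have each logged the local arrival time of the same sequence of reference broadcasts. The timestamps and the function name are invented, and clock drift (which the least squares fit mentioned above would handle) is ignored.

```python
from statistics import mean


def rbs_offset(arrivals_i, arrivals_j):
    """Average clock offset of receiver j relative to receiver i.

    arrivals_i[k] and arrivals_j[k] are the local times at which the two
    receivers heard the k-th reference broadcast.
    """
    if not arrivals_i or len(arrivals_i) != len(arrivals_j):
        raise ValueError("both receivers must log the same non-empty set of broadcasts")
    return mean([tj - ti for ti, tj in zip(arrivals_i, arrivals_j)])


node_i = [10.0, 20.1, 30.0]
node_j = [12.0, 22.0, 32.1]
print(rbs_offset(node_i, node_j))  # close to 2.0
```

The sender's own clock never appears in the calculation, which is why the send and access delays on the sender side drop out.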
4.3 TPSN
TPSN adopts a hierarchical structure: all nodes are logically ranked by level, and each node synchronizes with a node from the previous level (in the same pairwise manner as NTP).
Principle/Idea:
- Implement synchronization using a hierarchical structure
- Nodes are logically graded according to the hierarchical structure, indicating the distance to the root node
- Based on a sender-receiver pair method, each node synchronizes with the node one level higher
Root node: Communicates with the outside world and obtains the time, serves as the clock source for the entire network system
Process:
- Hierarchy formation: The root node is at level 0, and a level i node can communicate with at least one level i-1 node
- Time synchronization: Level 1 nodes synchronize with the root node, and level i nodes synchronize with level i-1 nodes
Problems: Synchronization errors accumulate from level to level; the entire network takes a very long time to synchronize; collisions (contention) are possible during synchronization between two adjacent levels, which is solved by random waiting.
Optimization: Time stamps are added to messages only at the start of transmission at the MAC layer to eliminate access errors
Industrial Internet
What is the Industrial Internet
What is Digital Twin, the Five-Dimensional Model
1 Industrial Internet vs. Traditional Consumer Internet
PPT 36/117 The Industrial Internet is an evolution and upgrade targeted at the real economy, built upon the foundation of the Internet.
2 Relationship between Industrial Internet and Industry 4.0
Industry 4.0 originated in Germany, with a focus on the intelligence and digitization of production and manufacturing processes.
The Industrial Internet originated in the United States, leaning more towards using internet technology to improve production equipment and product services.
3 Five-Dimensional Model of Digital Twin
- Physical Entities: Various subsystems and sensors
- Virtual Entities: Mapping of physical devices
- Services: Optimizing physical devices, correcting virtual devices
- Network Connection: Keeping physical devices, virtual devices, and services interactive during operation
- Twin Data: The driving force behind the operation of physical devices, virtual devices, and services
4 Specific Content of Made in China 2025
In terms of significance, Made in China 2025, compared to Industry 4.0 and the Industrial Internet, has a more explicit goal, more definite connotation, and a clearer path.
Theme: Promote the innovative development of the manufacturing industry
Core: Improve quality and efficiency
Main focus: Accelerate the deep integration of new-generation information technologies and the manufacturing industry
Main direction: Advance smart manufacturing
Goal: Meet the needs of economic and social development and of national defense construction for major technological equipment
5 Ultimate Question
What is the relationship between the Internet of Things (IoT), Big Data, Cloud Computing, and Artificial Intelligence?
- IoT devices generate a large amount of data, serving as part of the source of big data
- Cloud computing provides massive computing and storage resources for processing and analyzing big data
- Big data and cloud computing respectively provide ample training samples and computing resources for the learning of artificial intelligence
- IoT offers a broader application scenario for artificial intelligence, such as smart homes, intelligent transportation, etc.
- In summary, IoT, big data, cloud computing, and artificial intelligence support each other
Level 20 Exam Questions (partial)
1 Short Answer Questions
Randomly pick a few from the review outline
2 Analysis Questions
- Why is the CSMA/CD protocol not suitable for wireless LANs?
- State the two main classifications of positioning technology and list some algorithms for each.
- Analyze and compare TPSN and RBS.
3 Comprehensive Questions
- Provide a network topology diagram with eight nodes, the first question asks to write out the neighbor node sequences for five of the nodes; the second question requires calculating the RSD between two nodes.
- Complete the cloze for a red box similar to the one in the image below, requiring the writing of English abbreviations and explanations.