The 802.11ax amendment was built to increase the efficiency of WiFi. One of its new features is OFDMA, the capability to send data to several stations simultaneously (multi-user operation).
One of the benefits of OFDMA is that it shortens the time it takes to send data to several stations compared with single-user operation, especially for smaller frame sizes. But how small?
In this blog, I will check the trade-off between SU and MU operations in 802.11ax for three typical scenarios.
You will be amazed. I was.
TXOP durations, SU and MU
Let us pretend the AP receives data from the infrastructure that shall be sent to two wireless clients, the same data to both. Because of contention on the wireless channel, the data is temporarily stored in the AP's buffer. When the channel becomes free, the AP will send the data.
In this blog, I will try to find out when MU operation and when SU operation takes the least time to send the data to those two clients.
For this case, the TXOP durations look like this:

The first diagram shows the duration of the transmission using single-user HE frame format for the data frame. The AP needs to send two TXOPs for transferring the data to both clients.
The second diagram shows the same data sent to both clients using multi-user OFDMA operations. This time the AP sends the data to both clients simultaneously in one TXOP.
Common for both operations is that the TXOPs use AIFS[BE], CW=8, and MCS7 for the data field. Each TXOP in figure 1 sends one symbol of data.
If you look closer at the duration of each part of the TXOP, you will see a difference in the HE preamble and the Block Acknowledgement (B-Ack). I explain this later.
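As a small sketch of where the contention part of each TXOP comes from: assuming the usual OFDM timing of a 16 microsecond SIFS and 9 microsecond slots, and an AIFSN of 3 for best effort (these values are my assumptions, they are not stated above), the AIFS and backoff durations can be worked out like this. The function names are made up for illustration.

```python
SIFS_US = 16  # assumed SIFS in microseconds
SLOT_US = 9   # assumed slot time in microseconds


def aifs_us(aifsn: int) -> int:
    """AIFS = SIFS + AIFSN slots."""
    return SIFS_US + aifsn * SLOT_US


def backoff_us(slots: int) -> int:
    """Duration of a random backoff that counts down the given number of slots."""
    return slots * SLOT_US


print(aifs_us(3))     # AIFS[BE], assuming AIFSN = 3
print(backoff_us(8))  # backoff of 8 slots, matching CW=8 in the text
```

Under these assumptions the two values come out at 43 and 72 microseconds, which are the first two terms used in the duration formulas in this blog.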
The SU TXOP
Let’s look at the SU TXOP. Another view is like this:

When the AP decides to send the data to the clients with SU operation, it sends the data frame in the HE SU frame format, and the BlockAck from the clients is sent back in the legacy 802.11a frame format. The HE SU frame format is briefly outlined in figure 2.
The duration of each part of the TXOP can be found in the WiFiAirTimeCalculator.
Without any further explanation, the duration for two TXOPs is:
t(2 x TXOPsu) = 2 * (43 + 72 + 20 + 22.2 + duration of the data field + 16 + 44) microseconds, or
2 * (217.2 + duration of the data field).
Remark: the BlockAck is sent at 12 Mb/s.
In figure 1 there is one symbol of data in each TXOP. One symbol is 14.4 microseconds including the medium guard interval (1.6 microseconds), so the shortest time for sending two TXOPs with SU operation is approx. 460 microseconds.
It can and will vary slightly depending on the number of spatial streams and other factors.
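The SU arithmetic can be sketched in a few lines of Python. The numbers are the ones from the formula above; the labels on the individual terms are my reading of their order in the TXOP and are an assumption, as are the function names.

```python
# Per-TXOP overhead in microseconds, taken from the SU formula.
# The key names are assumed labels, inferred from the order of the terms.
SU_OVERHEAD_US = {
    "aifs": 43.0,
    "backoff": 72.0,
    "legacy_preamble": 20.0,
    "he_su_preamble": 22.2,
    "sifs": 16.0,
    "block_ack": 44.0,
}

SYMBOL_US = 12.8 + 1.6  # one data symbol plus the medium guard interval


def su_txop_us(data_symbols: int) -> float:
    """Duration of one SU TXOP carrying the given number of data symbols."""
    return sum(SU_OVERHEAD_US.values()) + data_symbols * SYMBOL_US


def two_su_txops_us(data_symbols: int) -> float:
    """Two back-to-back SU TXOPs, one per client."""
    return 2 * su_txop_us(data_symbols)


print(round(two_su_txops_us(1), 1))  # one symbol per TXOP, as in figure 1
```

With one data symbol per TXOP this gives 463.2 microseconds, which is the "approx. 460" used in the rest of the blog.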
The DL MU TXOP
An 802.11ax downlink multi-user TXOP looks like this:

When the AP decides to send data to the clients with MU operations it will use the HE MU frame format. With this frame format, it will subdivide the channel into resource units and send the data to the clients in parallel. The information on how the channel is subdivided is in the HE preamble, in the HE-SIG-B subfield. This is the reason why the HE preamble has a longer duration during MU operation.
The other key difference compared with SU operation is that each receiving client/station sends its BlockAcknowledgement (BA) uplink to the AP in the HE Trigger-based frame format. Because this is an HE frame format, it needs an HE preamble like the other HE frame formats. That is why a BA in HE takes longer to transmit than a BA in the legacy frame format, but now several BAs are sent simultaneously.
A typical DL MU OFDMA TXOP (without RTS/CTS) seen in Wireshark looks like this:

- The first line in figure 4 is the Basic Trigger sub A-MPDU, where the AP informs the client how it shall send its BlockAck
- 11 sub A-MPDU data frames
- The last line is the BlockAck from the client, in a HE trigger-based frame format
Remark: the capturing device is configured to capture traffic for this specific client/station.
The total duration t(mu) = duration of AIFS[BE] + duration of CW (random backoff timer) + duration of legacy preamble + duration of HE MU preamble + duration of MU data field + SIFS + duration of legacy preamble + duration of HE trigger-based preamble + duration of the data field for the BA.
The last part (SIFS + Multi-STA BA) can be calculated manually, but I will use data from figure 4. The easiest place to look is the Duration/NAV field of the preceding data frame. In this blog, I use a NAV of 130 microseconds.
Alternatively, look at the UL Length field of the Basic Trigger frame carried in the DL data frame. This needs a small calculation:
UL Length * 4/3 + SIFS + legacy preamble, that is 70 * 4/3 + 16 + 20 = 129.3 microseconds.
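The UL Length shortcut is easy to script when you have several captures to check. This is only a sketch of the rule of thumb used above, not the full TXTIME calculation from the standard; the function name and default values are illustrative assumptions.

```python
def ba_tail_us(ul_length: int,
               sifs_us: float = 16.0,
               legacy_preamble_us: float = 20.0) -> float:
    """Approximate duration of SIFS plus the trigger-based BA.

    Uses the shortcut from the text: UL Length * 4/3 + SIFS + legacy preamble.
    """
    return ul_length * 4 / 3 + sifs_us + legacy_preamble_us


print(round(ba_tail_us(70), 1))  # UL Length of 70, as read from the capture
```

For a UL Length of 70 this gives about 129.3 microseconds, close to the NAV value of 130 microseconds in the preceding data frame.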
With this setup, two clients and MCS0 for the HE-SIG-B, the HE-SIG-B needs three symbols.
Without any more discussion and explanation, the duration of a DL MU TXOP is:
t(mu) = 43 + 72 + 20 + 33.2 + duration of the DL data field + 130 microseconds, or
t(mu) = 298.2 + duration of the DL data field.
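The same kind of sketch works for the MU TXOP. Again the numbers come from the formula above, while the labels and function name are my own assumptions for illustration.

```python
# Overhead of one DL MU TXOP in microseconds, taken from the MU formula.
# The key names are assumed labels, inferred from the order of the terms.
MU_OVERHEAD_US = {
    "aifs": 43.0,
    "backoff": 72.0,
    "legacy_preamble": 20.0,
    "he_mu_preamble": 33.2,
    "sifs_plus_ba": 130.0,  # NAV value read from the capture
}

SYMBOL_US = 14.4  # one data symbol including the 1.6 us guard interval


def mu_txop_us(data_symbols: int) -> float:
    """Duration of one DL MU TXOP carrying the given number of data symbols."""
    return sum(MU_OVERHEAD_US.values()) + data_symbols * SYMBOL_US


print(round(mu_txop_us(1), 1))  # one data symbol, as in figure 1
```

With one data symbol this gives 312.6 microseconds, the "312 microseconds" starting point of the MU line in the comparison.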
Different use of subcarriers in HE SU and MU operations
The use of subcarriers is different between SU and MU operations. On a 20 MHz channel, the transmitter uses the full channel during single-user operation. This is 242 subcarriers, where 234 are used for data transfer and the remaining 8 are pilot subcarriers.
During multi-user operation with the channel subdivided for two clients, the AP uses 106 subcarriers for each client, where 102 of those transfer data and 4 are pilot subcarriers. The center 26-tone resource unit (26 subcarriers) is not used.
Like this:

Comparing SU and MU operations
In figure 1 we saw that sending 2 TXOPs in single-user operation took 460 microseconds.
The same data sent in 1 TXOP with multi-user operation took 312 microseconds.
Both methods send one symbol of data.
Because single-user operation uses more of the spectrum (more data subcarriers) than multi-user operation, the duration of the MU TXOP increases faster than the duration of the single-user TXOPs when the data payload increases.
Why: SU uses 234 data subcarriers on a 20 MHz channel. When the same 20 MHz channel is subdivided into two 106-tone RUs, only 102 data subcarriers are used for each client. 102 in relation to 234 is 44%.
Therefore the two durations converge as the payload grows, and at some point they are equal.
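To put numbers on this, here is a small sketch of how many data bits fit in one symbol. It assumes that MCS7 means 64-QAM with coding rate 5/6, so 5 data bits per data subcarrier per spatial stream. The function name is made up for illustration.

```python
# Assumption: MCS7 = 64-QAM (6 coded bits per subcarrier) at coding rate 5/6.
BITS_PER_SUBCARRIER_MCS7 = 6 * 5 / 6


def data_bits_per_symbol(data_subcarriers: int, spatial_streams: int = 1) -> float:
    """Data bits carried by one OFDM symbol for one client."""
    return data_subcarriers * BITS_PER_SUBCARRIER_MCS7 * spatial_streams


su_bits = data_bits_per_symbol(234)  # full 20 MHz channel, single user
mu_bits = data_bits_per_symbol(102)  # one 106-tone RU, one of two clients

print(su_bits, mu_bits, round(mu_bits / su_bits, 3))
```

Each client in MU operation gets a bit less than half the bits per symbol of an SU transmission (ratio about 0.436), so one MU data field needs slightly more symbols than the two SU data fields together. That small surplus is what slowly eats up the head start MU gets from its lower overhead.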
Three scenarios
We will look at three scenarios. Common for all of them is that the data is sent with MCS7, AIFS[BE], and a CW of 8. The SU BlockAck is sent at 12 Mb/s. The duration for the MU BlockAck is picked from a capture.
No protection (RTS/CTS) is used. With protection the result would have been slightly different, especially when using the MU-RTS frame during MU operation, but not by much.
In the following figures, the payload in bytes is on the x-axis and the duration in microseconds is on the y-axis.
All three start with a duration of 312 microseconds for the MU TXOP and 460 microseconds for the two SU TXOPs. The circle marks approximately where the two lines cross each other.
Scenario 1 is with 20 MHz and 1 spatial stream

Scenario 2 is with 20 MHz and 2 spatial streams

Scenario 3 is with 80 MHz and 2 spatial streams, which will be a very common scenario in WiFi 6E (WiFi in the 6 GHz band).

As we can see, the AP has to send rather big frames before the duration of the MU TXOP exceeds the duration of the two SU TXOPs.
At 20 MHz and 1 spatial stream the break-even point is approx. 5,000 bytes (5 kB).
At 20 MHz and 2 spatial streams the break-even point is approx. 10,000 bytes (10 kB).
At 80 MHz and 2 spatial streams the break-even point is approx. 120,000 bytes (120 kB).
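The crossing points can also be estimated without a plot. The sketch below sets the two duration formulas from this blog equal and solves for the payload per client. It rests on several assumptions of mine: 5 data bits per subcarrier per stream for MCS7, no rounding up to whole symbols, no MAC or padding overhead, the same preamble and BA overhead at 80 MHz as at 20 MHz, and for 80 MHz a 996-tone SU transmission (980 data subcarriers) against two 484-tone RUs (468 data subcarriers each).

```python
SYMBOL_US = 14.4         # data symbol incl. 1.6 us guard interval (from the text)
SU_OVERHEAD_US = 217.2   # per SU TXOP, from the SU formula
MU_OVERHEAD_US = 298.2   # per MU TXOP, from the MU formula
BITS_PER_SUBCARRIER = 5  # assumed for MCS7: 64-QAM at coding rate 5/6


def break_even_bytes(su_data_sc: int, mu_data_sc: int, streams: int) -> float:
    """Payload per client at which one MU TXOP takes as long as two SU TXOPs."""
    su_bits = su_data_sc * BITS_PER_SUBCARRIER * streams  # bits per SU symbol
    mu_bits = mu_data_sc * BITS_PER_SUBCARRIER * streams  # bits per MU symbol
    # Solve 2 * (SU_OH + T * B / su_bits) = MU_OH + T * B / mu_bits for B bits.
    extra_us_per_bit = SYMBOL_US * (1 / mu_bits - 2 / su_bits)
    payload_bits = (2 * SU_OVERHEAD_US - MU_OVERHEAD_US) / extra_us_per_bit
    return payload_bits / 8


SCENARIOS = {
    "20 MHz, 1 SS": (234, 102, 1),
    "20 MHz, 2 SS": (234, 102, 2),
    "80 MHz, 2 SS": (980, 468, 2),  # assumed RU split, see lead-in
}

for name, args in SCENARIOS.items():
    print(f"{name}: about {break_even_bytes(*args):,.0f} bytes")
```

Under these assumptions the estimates land at roughly 4.7 kB, 9.4 kB and 123 kB, which is in line with the approximate values read from the three figures.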
Summary
In this blog, I have compared the time it takes to send data from an AP to two clients using HE (802.11ax) single-user and multi-user operation. For small frame sizes, multi-user operation has the shortest duration. The break-even point, above which single-user operation becomes shorter than multi-user operation, depends on whether the data is sent over 20 or 80 MHz and with 1 or 2 spatial streams.
For me, it was surprising how large the payloads have to be before single-user operation becomes more efficient than multi-user operation.
But this is only one scenario out of an endless number of scenarios. This scenario was with the same amount of data sent to two clients with the same MCS and AIFS.
I hope this is useful for you. It was for me.