Stderr and Stdout: Understanding Logs and Outputs

Standard output (stdout) and standard error (stderr) are two concepts that, while simple, play a core role in log recording, error handling, and data stream management. This article will explore the differences and applications of stdout and stderr, especially how to use them effectively in a Python environment.

Standard Output (stdout) and Standard Error (stderr)

In most operating systems, standard output and standard error are two main output streams of a process. They provide a mechanism for the process to send information and error messages to a terminal or file. Although these two streams might be physically the same (for example, both displayed on the same terminal interface), they are used for different purposes logically:

  • Standard Output (stdout): Typically used to output the results of program execution or normal running information.
  • Standard Error (stderr): Specifically used for outputting error messages or warnings, which are usually intended to be seen or recorded even when standard output is redirected.

In Python, the print function by default sends information to stdout, while the logging module sends log messages to stderr by default. This is done to differentiate normal program output from log (including error and debug information) output, making it easier for developers to manage and filter output information.

Using print

print is the most basic output function in Python, used to send information to the standard output stream. It is simple to use and suitable for quick debugging or displaying information to users. For example:

print("Hello, world!")
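print also accepts a file argument, so the same function can write to stderr when needed. The following is a minimal sketch; the function name report and the messages are hypothetical, chosen only for illustration.

```python
import sys


def report(result, warning=None):
    """Write the result to stdout and any warning to stderr."""
    print(result)  # goes to sys.stdout by default
    if warning is not None:
        print(warning, file=sys.stderr)  # explicitly routed to stderr


if __name__ == "__main__":
    report("42 records processed", warning="warning: 3 records skipped")
```

If the script's stdout is redirected to a file, the warning should still appear on the terminal, because only the first print targets stdout.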

Using logging

The logging module provides a flexible framework for adding log messages in applications. Unlike print, logging supports different log levels (DEBUG, INFO, WARNING, ERROR, CRITICAL), allowing developers to adjust the detail and output location of logs as needed. For example:

import logging

logging.error('This is an error message')
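To illustrate how the level and the output location can be adjusted, here is a small sketch. The helper build_logger, the logger name "demo", and the format string are assumptions made for this example, not something the logging module prescribes.

```python
import logging
import sys


def build_logger(name="demo", level=logging.INFO, stream=None):
    """Return a logger that writes records at `level` or above to `stream`."""
    logger = logging.getLogger(name)
    logger.setLevel(level)
    logger.propagate = False  # keep the root logger from printing duplicates
    logger.handlers.clear()   # make repeated calls idempotent
    handler = logging.StreamHandler(stream)  # stream=None means sys.stderr
    handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    log = build_logger()
    log.debug("hidden: below the INFO threshold")
    log.info("written to stderr")
    build_logger("to_stdout", stream=sys.stdout).info("written to stdout")
```

Passing a different stream (or a different handler class such as logging.FileHandler) is what moves the log output elsewhere; the calling code does not change.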

tqdm and stderr

In complex or long-running programs, using a progress bar is an effective way to show process progress to users. The tqdm library in Python is a widely used tool for adding progress bars in the command line. tqdm outputs progress information to stderr by default, to avoid interfering with normal program output (stdout).
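The idea behind this default can be sketched with the standard library alone. The code below is a hand-rolled stand-in, not tqdm's implementation; the function process and its progress parameter are hypothetical. (tqdm itself accepts a file argument if you want the bar on a different stream.)

```python
import sys


def process(items, progress=None):
    """Yield squared items while reporting progress on a separate stream.

    `items` must be a sized sequence; `progress` defaults to sys.stderr.
    """
    progress = progress if progress is not None else sys.stderr
    total = len(items)
    for done, item in enumerate(items, start=1):
        # "\r" returns to the start of the line so the counter overwrites itself.
        progress.write(f"\r{done}/{total}")
        progress.flush()
        yield item * item
    progress.write("\n")


if __name__ == "__main__":
    for value in process([1, 2, 3]):
        print(value)  # results on stdout stay clean when redirected
```

Because the counter goes to stderr, redirecting stdout to a file captures only the results.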

Diverting stdout and stderr

In some cases, separating normal output from errors or log messages, for example, redirecting them to different files or terminals, is beneficial. In the command line, this can be achieved using the redirection operators > and 2>. In Python code, more fine-grained control can be achieved by configuring the logging module or using specific file objects.

python script.py > output.log 2> error.log

Through command line redirection, Python’s print function, or even the logging module, it is possible to flexibly control and divert these two types of output, making error handling, logging, and user interaction clearer and more orderly.
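Inside Python, a similar separation can be sketched with contextlib.redirect_stdout and contextlib.redirect_stderr. The helper run_captured and the workload job are hypothetical names used only for this example; in-memory buffers are used here, but file objects returned by open() work the same way.

```python
import contextlib
import io
import sys


def run_captured(func):
    """Call `func` and return (stdout_text, stderr_text) it produced."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        func()
    return out.getvalue(), err.getvalue()


def job():
    # Hypothetical workload that writes to both streams.
    print("result: 42")
    print("warning: slow path", file=sys.stderr)


if __name__ == "__main__":
    out_text, err_text = run_captured(job)
    print(repr(out_text), repr(err_text))
```

Note that these context managers only affect writes made through sys.stdout and sys.stderr in the current process; output from child processes or C extensions that write to the file descriptors directly is not captured.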

Managing stdout and stderr with nohup

When deploying long-running background processes, the nohup command becomes an important tool. nohup, or “no hang up,” allows commands to continue running after the user logs out, which is especially useful for remotely initiated tasks. A key feature of nohup is its ability to manage stdout and stderr.

By default, when stdout is a terminal, nohup appends the command's output to the nohup.out file, and stderr is sent to the same place unless you redirect it yourself. This means that both regular output and error messages are captured in the same file, facilitating later review. However, in some cases, it may be more useful to separate these two outputs.

Using nohup to separate stdout and stderr

To output stdout and stderr to different files while using nohup, you can combine redirection operators. For example:

nohup python script.py > output.log 2> error.log &

This command redirects stdout to output.log, stderr to error.log, and runs in the background with &. Thus, even if the terminal or SSH session is closed, the program will continue to run, and its output will be properly recorded.

Buffering behavior in Python

stdout and stderr buffer data differently. When stdout is connected to a terminal (interactive mode), it is line-buffered: data is held until a newline is written or the buffer fills. When stdout is redirected to a file or pipe (non-interactive mode), it is block-buffered, like a regular file. stderr is line-buffered in both cases since Python 3.9; in earlier versions, non-interactive stderr was block-buffered. The following is quoted from the official documentation of the sys module (Python 3.12.2):

When interactive, the stdout stream is line-buffered. Otherwise, it is block-buffered like regular text files. The stderr stream is line-buffered in both cases. You can make both streams unbuffered by passing the [-u](<https://docs.python.org/3.12/using/cmdline.html#cmdoption-u>) command-line option or setting the [PYTHONUNBUFFERED](<https://docs.python.org/3.12/using/cmdline.html#envvar-PYTHONUNBUFFERED>) environment variable.

Changed in version 3.9: Non-interactive stderr is now line-buffered instead of fully buffered.

The smaller the buffering granularity, the more timely the output, but the higher the I/O cost. Up to Python 3.8, non-interactive stdout and stderr were both block-buffered, which was a questionable default; from version 3.9 on, stderr is line-buffered, so each line written to it appears sooner than output written to stdout. This makes stderr better suited to error and log messages: they are less likely to be stuck in a buffer, and therefore lost, if the program crashes or exits abnormally.

In C++, standard error is unbuffered (discussed later), which is more aggressive still, but I personally think it makes more sense.

Fortunately, Python lets you change this behavior: run the interpreter with python -u or set the PYTHONUNBUFFERED environment variable to disable buffering altogether, or call sys.stdout.flush() (or print(..., flush=True)) to control exactly when output is written.
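The effect of line buffering versus block buffering, and of an explicit flush, can be observed with an in-memory stream. This is a sketch under assumptions: make_stream is a hypothetical helper, the BytesIO object stands in for the real file descriptor, and the behavior shown for small writes is what CPython's io.TextIOWrapper does.

```python
import io


def make_stream(line_buffering):
    """Return (text_stream, sink); `sink` stands in for the real destination."""
    sink = io.BytesIO()
    stream = io.TextIOWrapper(
        sink,
        encoding="utf-8",
        newline="\n",  # no newline translation, so bytes match on every platform
        line_buffering=line_buffering,
    )
    return stream, sink


if __name__ == "__main__":
    block, block_sink = make_stream(line_buffering=False)
    block.write("pending\n")
    print(block_sink.getvalue())  # text is still held in the wrapper's buffer
    block.flush()                 # same call you would make on sys.stdout
    print(block_sink.getvalue())

    line, line_sink = make_stream(line_buffering=True)
    line.write("immediate\n")     # the newline triggers an implicit flush
    print(line_sink.getvalue())
```

On Python 3.7 and later, sys.stdout.reconfigure(line_buffering=True) applies the same setting to the real stdout without restarting the interpreter.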

Behavior in Python concurrent environments

When using stdout and stderr in a multi-threaded or multi-process environment, the output may become interleaved or chaotic, as outputs from different threads or processes may interfere with each other when written to the terminal or file. One way to solve this problem is to create independent output files for each thread or process, or to use thread locks or process synchronization mechanisms (such as multiprocessing.Lock) to synchronize access to stdout or stderr.
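For the thread case, the locking approach might look like the sketch below. The names safe_print and worker are hypothetical; for separate processes, a multiprocessing.Lock passed to each worker would play the same role.

```python
import sys
import threading

_print_lock = threading.Lock()  # shared by every thread that writes


def safe_print(*args, **kwargs):
    """print() wrapper that lets only one thread write at a time."""
    with _print_lock:
        print(*args, **kwargs)


def worker(worker_id, count, stream):
    """Write `count` lines to `stream`, one complete line per lock acquisition."""
    for i in range(count):
        safe_print(f"worker {worker_id} line {i}", file=stream)


if __name__ == "__main__":
    threads = [
        threading.Thread(target=worker, args=(n, 3, sys.stdout)) for n in range(4)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

The lock guarantees that each line is written whole; it does not impose any particular order between threads.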

Controlling stdout and stderr in Python

In complex applications, you may need to more flexibly control the destination of the output stream. Python provides several ways to achieve this:

  • Redirecting stdout and stderr: You can redirect the standard output and error output of a Python program by changing the values of sys.stdout and sys.stderr. This is especially useful for capturing and analyzing output or redirecting output to non-standard output devices such as graphical interfaces.
  • Using the subprocess module: When running external commands or scripts, the subprocess module allows you to control the stdout and stderr streams of the command, including redirecting them to variables within the Python program or separating or merging them.
  • Advanced applications of the logging module: Python’s logging module supports logging output to multiple destinations, including files, standard output, networks, etc. By configuring different log handlers, you can implement complex log management schemes, such as diverting logs to different outputs based on log level or message content.
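As an illustration of the subprocess option, the sketch below runs a child Python process and either keeps its two streams separate or merges them. CHILD_CODE and run_child are hypothetical names; the child program exists only to produce one line on each stream.

```python
import subprocess
import sys

# Hypothetical child program: one line on each stream, then a non-zero exit.
CHILD_CODE = (
    "import sys\n"
    "print('data')\n"
    "print('oops', file=sys.stderr)\n"
    "sys.exit(3)\n"
)


def run_child(merge=False):
    """Run the child and return the CompletedProcess with captured text."""
    return subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.PIPE,
        # STDOUT folds stderr into the stdout pipe; PIPE keeps it separate.
        stderr=subprocess.STDOUT if merge else subprocess.PIPE,
        text=True,
    )


if __name__ == "__main__":
    result = run_child()
    print("stdout:", result.stdout.strip())
    print("stderr:", result.stderr.strip())
    print("exit code:", result.returncode)
```

When the streams are merged, the relative order of the two lines depends on the child's buffering, so it should not be relied upon.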

Recommendations

  • Manage output carefully: When designing software, clearly distinguish between outputs for user interaction (stdout) and outputs for error reporting or log recording (stderr). This helps improve the usability and maintainability of the program.
  • Optimize performance: Consider the performance impact of output operations, especially in scenarios of high-frequency logging or data output. Proper use of buffering and batch processing can reduce the impact on performance.
  • Consider security: Filter and redact sensitive information before writing it out, so that logs do not expose sensitive data.
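One possible way to apply the security recommendation is a logging filter that masks secrets before any handler sees them. This is only a sketch: the class RedactSecrets, the regular expression, and the logger name are assumptions for illustration, and a real application would need a pattern that matches its own sensitive fields.

```python
import logging
import re

# Hypothetical pattern: treats anything after "password=" or "token=" as secret.
SECRET_PATTERN = re.compile(r"(password|token)=\S+", re.IGNORECASE)


class RedactSecrets(logging.Filter):
    """Mask secret values in a record's message before handlers emit it."""

    def filter(self, record):
        message = record.getMessage()  # apply %-style arguments first
        record.msg = SECRET_PATTERN.sub(r"\1=***", message)
        record.args = ()  # the message is already fully formatted
        return True  # keep the record


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    logger = logging.getLogger("auth")  # hypothetical logger name
    logger.addFilter(RedactSecrets())
    logger.info("login attempt user=%s password=%s", "alice", "hunter2")
```

A filter attached to a logger applies only to records logged through that logger; attaching it to a handler instead covers every record the handler receives.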

By deeply understanding and flexibly applying stdout and stderr, you can build more robust and manageable Python applications, effectively handle logs and outputs, and enhance user experience and application stability.

Buffering Behavior in C++

In C++, std::cout and std::cerr (which by default are synchronized with the C streams stdout and stderr) have different buffering strategies:

  • std::cout is typically line-buffered when it is connected to a terminal, which means the output is flushed at each newline or when the buffer is full. When it is redirected to a file or pipe, it is usually fully buffered. The exact behavior is implementation-defined.
  • std::cerr is unbuffered by default, so data written to std::cerr is output immediately, which is very useful for reporting error information as it reduces the risk of losing error messages due to program crashes.

Redirecting stdout and stderr

In C++ programs, there are various ways to redirect stdout and stderr. A common method is to use the freopen function to redirect standard output or error output to a file during program execution:

#include <cstdio>  // declares freopen
// Call these early in main(), before any output is written:
freopen("output.txt", "w", stdout);
freopen("error.log", "w", stderr);

This method can be used to direct output to a file, which is convenient for later analysis and debugging.

Using in a C++ Multithreaded Environment

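When using std::cout and std::cerr in a multithreaded C++ program, concurrent writes do not cause a data race as long as the streams remain synchronized with C stdio (the default), but characters from different threads can still interleave and produce disordered output. To avoid this, it is recommended to use a mutex (such as std::mutex) to synchronize access to these streams: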

#include <iostream>
#include <mutex>
#include <thread>

std::mutex cout_mutex;

void thread_function(int id) {
    std::lock_guard<std::mutex> lock(cout_mutex);
    std::cout << "Thread " << id << " is running\n";
}

int main() {
    std::thread t1(thread_function, 1);
    std::thread t2(thread_function, 2);

    t1.join();
    t2.join();

    return 0;
}

Controlling Output in C++

The C++ standard library provides std::streambuf, which gives finer-grained control over std::cout and std::cerr, including redirection and custom buffering behavior. Redirection works by installing a different buffer with rdbuf(); by deriving from std::streambuf and overriding its virtual member functions (such as overflow and sync), you can create custom buffering strategies or send output to GUI components, network connections, and so on.

Recommendations

  • Make rational use of buffering: Choose an appropriate buffering strategy according to the application scenario. For error messages that need immediate feedback, use std::cerr or manually flush std::cout.
  • Avoid using standard output directly in multithreaded environments: Use mutex locks or other synchronization mechanisms to ensure the consistency and order of output.
  • Use redirection and custom streambuf: To handle output more flexibly, consider using redirection or custom streambuf for special output needs, such as logging, network transmission, etc.

By mastering these advanced techniques, you can effectively manage and control the output of C++ programs while ensuring their robustness and flexibility.
