How does USENET news work?

People from around the world post messages (articles) to the various newsgroups. Their news system then distributes these articles to neighboring machines. Neighboring machines distribute the articles to their neighbors, and so on, until the articles have propagated all the way around the world. Machines check the incoming articles to see if they already have a copy, and quietly delete duplicates.
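To make the flooding idea concrete, here is a minimal sketch in Python. It assumes that each article carries a unique identifier (real articles have a Message-ID header for this purpose) and that each machine remembers the identifiers it has already seen. The four-machine topology and the names flood and NEIGHBORS are made up for illustration; this is a toy simulation, not how any particular news system is written.

```python
from collections import deque

# Hypothetical topology: each machine lists the neighbors it feeds.
NEIGHBORS = {
    "alpha": ["beta", "gamma"],
    "beta": ["alpha", "gamma", "delta"],
    "gamma": ["alpha", "beta", "delta"],
    "delta": ["beta", "gamma"],
}


def flood(neighbors, origin, message_id):
    """Simulate one article spreading from `origin` to every reachable machine.

    Returns (seen, offers): `seen` maps each machine to the set of article
    identifiers it holds, and `offers` counts how many times the article
    was offered to a neighbor, duplicates included.
    """
    seen = {host: set() for host in neighbors}
    seen[origin].add(message_id)
    queue = deque([origin])
    offers = 0
    while queue:
        host = queue.popleft()
        for peer in neighbors[host]:
            offers += 1
            if message_id in seen[peer]:
                continue  # the peer already has a copy: quietly drop it
            seen[peer].add(message_id)
            queue.append(peer)  # the peer now passes it to its own neighbors
    return seen, offers
```

Every machine ends up with exactly one copy, even though the article is offered more often than that; the extra offers are the duplicates that get thrown away.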

Historically, a program like cnews was responsible for the on-disk management of the news articles, and performed two major operations:

  1. Get the news articles and store them on disk.
  2. Delete old news articles.

Let's look at a typical system. As each article arrives (whether by UUCP, NNTP, or some other means), the article's "header" portion is scanned, and the news software determines where (i.e., in which newsgroups) that article should be stored.
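As a rough illustration of that header scan, the sketch below pulls the list of newsgroups out of an article. It assumes the article is available as a text string and that the groups are listed, separated by commas, in a Newsgroups header line. The function name newsgroups_for and the sample article are invented for this example; Python's standard email parser is used only because news headers share the same "Name: value" layout as mail headers.

```python
from email import message_from_string


def newsgroups_for(article_text):
    """Return the newsgroups named in the article's Newsgroups header."""
    headers = message_from_string(article_text)  # header parsing stops at the first blank line
    raw = headers.get("Newsgroups", "")
    return [group.strip() for group in raw.split(",") if group.strip()]


# A made-up article, just to exercise the function.
SAMPLE_ARTICLE = (
    "From: someone@example.com\n"
    "Newsgroups: comp.os.qnx,comp.unix.misc\n"
    "Subject: test posting\n"
    "Message-ID: <1@example.com>\n"
    "\n"
    "Body text goes here.\n"
)
```

An article cross-posted to several groups simply yields several names, and the news software then has to arrange for it to appear in each of them.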

A long time ago, when there wasn't all that much news traffic, it seemed like a good idea to just store one article per file. The newsgroup names got converted into pathnames, and everything was simple. For example, if I had an incoming article for comp.os.qnx, I would pick the next article number for that newsgroup (say 1143), and store the new article in a file called /var/spool/news/comp/os/qnx/1143. (The /var/spool/news part is just the name of the directory where all of the incoming news articles live; it's up to each site to determine which directory that is, but /var/spool/news is common.)

The next article that came in for that newsgroup would go into /var/spool/news/comp/os/qnx/1144, and so on.
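The one-article-per-file scheme can be sketched in a few lines of Python. The function names here (group_dir, next_article_number, store_article) are hypothetical, and finding the next number by scanning the directory is an assumption made to keep the example short; a real news system would more likely keep track of the highest number per group somewhere rather than rescanning each time.

```python
import os


def group_dir(spool, newsgroup):
    """Turn a newsgroup name into a directory: comp.os.qnx -> <spool>/comp/os/qnx."""
    return os.path.join(spool, *newsgroup.split("."))


def next_article_number(directory):
    """Return one more than the highest numbered name already in the directory."""
    numbers = [int(name) for name in os.listdir(directory) if name.isdigit()]
    return max(numbers, default=0) + 1


def store_article(spool, newsgroup, text):
    """Store the article under the next free number and return its path."""
    directory = group_dir(spool, newsgroup)
    os.makedirs(directory, exist_ok=True)
    path = os.path.join(directory, str(next_article_number(directory)))
    with open(path, "x") as article_file:  # "x" refuses to overwrite an existing article
        article_file.write(text)
    return path
```

Calling store_article twice for comp.os.qnx on an empty spool directory creates comp/os/qnx/1 and then comp/os/qnx/2 beneath it, which mirrors the 1143, 1144 sequence in the example above.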