VHS digitisation: Theory before practice

Duplication by any means is prohibited

Ripping a bunch of VHS tapes is a most peculiar undertaking.

It’s a project I’ve been waiting to start for several years — a task which (lack of personal experience with PC-based video work aside) I’ve therefore come prepared for:

  • VCR? Check.
  • VHS tapes? Check.
  • Adequate PC? Check.
  • Video grabber? Check.
  • Digital video playback device? Check.
  • Video grabbing and editing software? Erhmm… waitaminnit!

Being late to this particular party, one could’ve expected that much of the rudimentary groundwork had already been covered by others, with the common pitfalls marked out and FAQs on the topic being widely available. Suitable software should be a pound a GB, and there must be many useful, all-in-one programs for what by now must surely be a common task. Self-proclaimed experts should’ve dished out reams of advice.

No, not quite.

It’s a task that’s as simple or as complicated as you make it. I chose the less simple route because I wanted to understand a few things first.

Google searches yield tons of superficial advice on the first page. There’s also useful insight on the Digital FAQ and VideoHelp forums which, initially, leaves the novice more confused than before and results in the installation of a slew of useless software and clashing codecs. The common, and rather obvious, advice is that PAL video should be captured at 720 × 576 px: the highest possible resolution and coincidentally the same as that of a regular video DVD (as laid down by ITU-R Recommendation BT.601).

The fact that these numbers divide by 16 comes into play later when dealing with compression and macroblocks.
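To make the divide-by-16 point concrete, here's a minimal sketch that counts how many 16 × 16 macroblocks tile a frame. The function name and the 16-pixel constant are my own illustration, based on the block size used by MPEG-style codecs; it isn't taken from any particular encoder.

```python
MACROBLOCK = 16  # pixels per side; the block size assumed for this sketch


def macroblock_grid(width, height, block=MACROBLOCK):
    """Return (columns, rows) of whole macroblocks, or None if the
    frame does not tile evenly into block x block squares."""
    if width % block or height % block:
        return None
    return width // block, height // block


cols, rows = macroblock_grid(720, 576)
print(cols, rows, cols * rows)    # 45 36 1620
print(macroblock_grid(702, 576))  # None: 702 is not a multiple of 16
```

A 720 × 576 frame tiles cleanly, whereas a 702-pixel-wide frame would leave a partial block at the edge.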

For a much finer level of detail you’d have to go back in history and read through Chris Pirazzi’s excellent Lurker’s Guide to Video (it’s no coincidence that the European PAL TV system has a refresh rate of 50 Hz, the same as that of AC power).

You need to know this, and you should already know that analogue TV (and VHS, for that matter) is broadcast and/or displayed at 25 fps (frames per second).

Oh, and it’s interlaced, so the viewer never gets to see the full picture but rather a sequence of 50 half-images per second: first the “odd rows” on a CRT, then the “even rows” a 50th of a second later. Nor is the second half-image simply the “other half” of the same still frame: the even rows show the events taking place a 50th of a second later. Each half-image is called a “field”, and two consecutive fields pair up to make one frame. The scheme has to do with phosphorescence, the return sweep of the CRT electron gun and the human eye’s persistence of vision. It’s basically an old hack that saves bandwidth and maintains a smooth flow of movement across a total of 625 horizontal lines, of which only 576 contain video information. That’s where the term 576i comes from.
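As a rough illustration of how a 576-line frame breaks down into two 288-line fields, here's a small sketch. The frame is just a list of labelled rows and the function names are hypothetical; real capture software works on pixel buffers and may number or order the fields differently.

```python
def split_fields(frame):
    """Split a frame (a list of rows) into two fields:
    rows 0, 2, 4, ... and rows 1, 3, 5, ..."""
    return frame[0::2], frame[1::2]


def weave(first, second):
    """Interleave two fields back into a single frame, row by row."""
    frame = []
    for a, b in zip(first, second):
        frame.extend([a, b])
    return frame


frame = [f"row {n}" for n in range(576)]
first, second = split_fields(frame)
print(len(first), len(second))        # 288 288
print(weave(first, second) == frame)  # True
```

Weaving only restores a clean picture when both fields come from the same instant. With real interlaced video the second field was recorded a 50th of a second later, which is why movement and scene changes show up as comb-like lines in a frame grab.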

Here’s a screen grab that demonstrates overscanning (the black spaces around the image) and interlacing, the latter being most prominent during scene changes such as this one where Veronica Cartwright is about to get violated by a guy in an Alien suit.

In space, no one can see you're interlaced

Ironically, I won’t be digitising this particular tape because I’ve since bought the movie at least twice on DVD. Commercial video cassettes do, however, provide the most stubborn test environment. No, not because of the Macrovision copy protection scheme but because of another consideration: the PAR (Pixel Aspect Ratio). In the analogue world there are no pixels, only scan lines. Those are sampled at the Rec. 601 rate of 13.5 MHz and the resulting pixels are not square; instead, they have a PAR of 128/117 which, on a TV, fills out a DAR (Display Aspect Ratio) of 4:3.

Confused? Here’s some more maths courtesy of Jukka Aho’s brilliant page:

  • 625/50 systems have a line length of 64 µs, of which 52 µs is the “active” part that contains actual image information. (The rest is reserved for horizontal blanking.)
  • 52 µs × 13.5 MHz = 702 samples (pixels) per scanline
  • In the vertical direction, there are 574 complete scanlines and 2 half lines. Even the half lines get digitized as if their “missing” other half belonged to the active picture, giving a total of 576 scanlines.
  • Thus, the active image area at 13.5 MHz sampling is 702×576 pixels. This is the actual area that forms the 4:3 (or anamorphic 16:9) frame.
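The arithmetic above, and the 128/117 pixel aspect ratio it leads to, can be checked in a few lines. This is only a sketch of the sums as I understand them; the constant and function names are mine, and the PAR is derived by assuming the 702 × 576 active area fills a 4:3 display exactly.

```python
from fractions import Fraction

SAMPLING_HZ = 13_500_000  # Rec. 601 luma sampling rate
ACTIVE_LINE_US = 52       # active part of a 64 µs line in 625/50 systems
ACTIVE_LINES = 576        # scanlines that carry picture information

# 52 µs x 13.5 MHz, kept in integers to avoid rounding
active_samples = SAMPLING_HZ * ACTIVE_LINE_US // 1_000_000


def pixel_aspect_ratio(dar, width, height):
    """PAR = DAR x height / width, for an image that fills the display."""
    return dar * Fraction(height, width)


par = pixel_aspect_ratio(Fraction(4, 3), active_samples, ACTIVE_LINES)
print(active_samples)  # 702
print(par)             # 128/117
```

In other words, each sampled pixel is a little wider than it is tall (about 1.094 : 1), which is what stretches 702 samples across a 4:3 picture.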

Raw, uncompressed video takes up over 1 GB per minute. With a decent codec (an MPEG-4 variant such as DivX, Xvid or H.264) you could compress a movie down to the size of a CD (or rather a Video-CD) with negligible artifacts but, ultimately, your choice of final format depends on the playback device(s) and how much disk space you’re prepared to sacrifice for those converted nuggets.
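For a feel of where the "over 1 GB per minute" figure comes from, here's a back-of-the-envelope sketch. It assumes 8-bit 4:2:2 sampling (an average of 2 bytes per pixel) at 720 × 576 and 25 fps; the 700 MB disc and 90-minute running time in the second function are purely illustrative numbers of my own.

```python
def raw_bytes_per_minute(width=720, height=576, fps=25, bytes_per_pixel=2):
    """Uncompressed video data per minute, ignoring audio and container overhead."""
    return width * height * bytes_per_pixel * fps * 60


def average_bitrate_kbps(capacity_bytes, minutes):
    """Average total bitrate (kbit/s) that fits a running time into a given capacity."""
    return capacity_bytes * 8 / (minutes * 60) / 1000


raw = raw_bytes_per_minute()
print(f"{raw / 1e9:.2f} GB per minute")  # 1.24 GB per minute

# Assumed example: a 90-minute film squeezed onto a 700 MB disc
print(round(average_bitrate_kbps(700_000_000, 90)), "kbit/s")  # 1037 kbit/s
```

Under those assumptions the budget of roughly 1 Mbit/s has to cover both picture and sound, so it gives a sense of how hard the codec has to work.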

While my PC can deal with just about any digital video format you throw at it, the same doesn’t hold true for the other three media players I’ve tested at home; each handled different containers with varying degrees of success so that I may end up encoding into multiple formats, depending on where I’m going to store them or how I’m going to view them.

Well, that’s some of the theory so far. Let’s put it into practice.

Image credits: Video grabs by hmvh DOT net
