Genotrance


Random thoughts, ideas and experiences

Automating the Encoder

I recently shared my solution for encoding home videos I record using a DVD camcorder. While the solution helped me understand the encoding process, it was not exactly user friendly in terms of the number of steps involved. Being the automation freak that I am, I have written a Python script that takes care of the entire process in a single step.

The Main Issues

The main issues with the original solution were as follows:

Copying the VOB files to your hard disk

Copying an entire DVD to your hard disk is no joke. It takes up a lot of space and is time consuming and boring. This had to be done since VOBMerge didn’t like to operate on VOB files directly on the DVD.

Using DVD Decrypter for corrupt VOBs

Corrupt VOBs are the biggest pain. First, you have to download DVD Decrypter to solve the problem. Second, it is a trial and error method: you have to calculate which megabyte is failing, uncheck the corresponding cell and try again, only to find another broken cell.

In addition, it takes out major chunks of data from your VOB per cell, so if you have a tiny scratch, you potentially lose a few megabytes of data instead of a few hundred kilobytes. If you have multiple corrupt VOBs, good luck maintaining high spirits.

Using VOBMerge to merge related VOBs

We have already been forced to copy the VOBs to the hard drive for VOBMerge to work. Now, VOBMerge will happily double the data on our hard drive through the merging process.

Running two batch files on each merged VOB

The merged VOB now has to be processed by two different batch files, one to create a downloadable version of the video and the other for archival purposes. Each of these steps takes a long time and has to be started by hand, so if you take a walk while one is running, the CPU sits idle until you come back to start the next.

Repeat for each unique video

Okay, so we got done with the first video. Now we have to repeat the merging and encoding all over again for every other video. Let me also mention the time spent deleting and copying VOB files around in between encodes in case you don’t have enough space to accommodate the entire DVD, merged VOBs and the final encoded output all at once.

The Solution

Instead of dealing with so many external utilities, I just whipped together a Python script that does all of the above in one step. All you need to do is decide how the separate VOBs need to be merged, what the output videos need to be named and an acceptable buffer size in case some VOBs are corrupt.

Below are the command line options of vobenc.py.

Usage:
  -j joblist         : List of jobs to execute
  -f jobfile         : File containing list of jobs to execute
  -s steplist        : List of step ids from config.ini to execute on each job
                       [default: all steps]
                       E.g. -s 1,2,3
  -p pathtovobfiles  : Location of the VOB files
                       [default: current directory]
  -d destinationpath : Location where converted files should be stored
                       [default: current directory]
  -c configfile      : Name of configuration file to load steps from
                       [default: config.ini]
  -b buffersize      : Number of bytes to copy at a time
                       [default: 102400 = 100K]

Job List Syntax:
  -j filename1=id1,id2:filename2=id3,id4

  filename           : Name of the output file
  id1,id2,...        : VTS_x_y.VOB files where the id is x

  E.g. -j outfile1=1,2,3:outfile2=4,5:outfile3=6

Job File Syntax:
  -f jobfilename

  E.g.
    outfile1=1,2,3
    outfile2=4,5
    outfile3=6

The above touches on several concepts, which are explained further below:

Jobs

Jobs allow us to specify all the output videos that need to be encoded as well as associate the output videos with the input VOB files. They can be specified on the command line using -j or in a file using -f. One or two jobs look fine on the command line but if you have more, it might be easier to create a job file and pass it onto vobenc.py.

Jobs automate the task of merging VOB files as well as running through every step for each output video.
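To make the job syntax concrete, here is a minimal sketch of how the -j and -f formats described above could be parsed. This is not the actual vobenc.py code; the function names (parse_job_entry, parse_job_list, parse_job_file_text, vob_pattern) are made up for illustration, and the two-digit zero padding in the VOB pattern is an assumption based on the usual VTS_01_1.VOB naming on DVDs.

```python
def parse_job_entry(entry):
    """Parse one 'outfile=1,2,3' entry into (name, [ids]). Hypothetical helper."""
    name, sep, ids = entry.strip().partition("=")
    if not sep or not name or not ids:
        raise ValueError(f"malformed job entry: {entry!r}")
    return name, [int(i) for i in ids.split(",")]


def parse_job_list(spec):
    """Parse the -j form: entries separated by ':'."""
    return dict(parse_job_entry(e) for e in spec.split(":") if e.strip())


def parse_job_file_text(text):
    """Parse the -f form: one entry per line, blank lines ignored."""
    return dict(parse_job_entry(line) for line in text.splitlines() if line.strip())


def vob_pattern(title_id):
    """Glob pattern for the VTS_x_y.VOB files of one id (padding is an assumption)."""
    return f"VTS_{title_id:02d}_*.VOB"


if __name__ == "__main__":
    jobs = parse_job_list("outfile1=1,2,3:outfile2=4,5:outfile3=6")
    for name, ids in jobs.items():
        print(name, [vob_pattern(i) for i in ids])
```

Both forms end up as the same mapping of output name to id list, which is why a job file can stand in for a long -j argument.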

Steps

In my original solution, there were essentially three steps that each video went through. The first and second steps were the first and second pass encoding of the video for high quality archival. The third step was to encode the downloadable version. These line up with the concept of steps in vobenc.py.

Below is the corresponding config.ini step file for what was proposed in the original solution. A downloadable version is further below.

[step1]
name = High Quality Pass 1
command = mencoder - -o nul -oac mp3lame -lameopts br=128:cbr:aq=0 -ovc xvid -xvidencopts bitrate=1150:vhq=4:pass=1 -msglevel all=5

[step2]
name = High Quality Pass 2
command = mencoder - -o #OUTFILE#-hi.avi -oac mp3lame -lameopts br=128:cbr:aq=0 -ovc xvid -xvidencopts bitrate=1150:vhq=4:pass=2 -msglevel all=5

[step3]
name = Low Quality
command = mencoder - -o #OUTFILE#-lo.mpg -oac lavc -srate 24000 -ovc lavc -lavcopts acodec=mp3:abitrate=24:vcodec=wmv2:vbitrate=96 -vf scale=264:180 -ofps 15 -msglevel all=1

Using the above format, we could have any number of steps. Also note the #OUTFILE# directive that allows us to pass on the output file specified in the job to the encoding utility.
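As a rough illustration of how a step file in this format could be read, the sketch below loads the [stepN] sections with Python's standard configparser module and substitutes #OUTFILE# for a given job. It is only a sketch under assumptions, not the real vobenc.py internals: load_steps and build_commands are hypothetical names, and the sample text simply reuses the first two steps from above.

```python
import configparser

# Sample step file text, copied from the first two steps shown above.
SAMPLE = """
[step1]
name = High Quality Pass 1
command = mencoder - -o nul -oac mp3lame -lameopts br=128:cbr:aq=0 -ovc xvid -xvidencopts bitrate=1150:vhq=4:pass=1 -msglevel all=5

[step2]
name = High Quality Pass 2
command = mencoder - -o #OUTFILE#-hi.avi -oac mp3lame -lameopts br=128:cbr:aq=0 -ovc xvid -xvidencopts bitrate=1150:vhq=4:pass=2 -msglevel all=5
"""


def load_steps(config_text):
    """Return {step_id: (name, command)} for every [stepN] section."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    steps = {}
    for section in parser.sections():
        if section.startswith("step"):
            step_id = int(section[len("step"):])
            steps[step_id] = (parser[section]["name"], parser[section]["command"])
    return steps


def build_commands(steps, outfile, selected=None):
    """Substitute #OUTFILE# in the selected steps (all steps if none given)."""
    ids = sorted(steps) if selected is None else selected
    return [steps[i][1].replace("#OUTFILE#", outfile) for i in ids]


if __name__ == "__main__":
    for command in build_commands(load_steps(SAMPLE), "holiday"):
        print(command)
```

Passing selected=[2] would mimic the effect of -s 2, running only the second step for each job.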

The way config.ini is laid out, we could pass the VOB input to any utility that would handle the VOB data from STDIN. In the above example, we are using mencoder. We could just as well use ffmpeg or any other command line encoder we prefer.
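Feeding data to an encoder over STDIN can be sketched with Python's subprocess module as below. This is an assumption about the general approach rather than a copy of vobenc.py: pipe_chunks is a hypothetical name, the command is passed as an argument list (a command string from config.ini would first need splitting, for example with shlex.split), and a tiny Python one-liner stands in for the encoder so the example runs without mencoder installed.

```python
import subprocess
import sys


def pipe_chunks(command, chunks):
    """Start `command`, write each bytes chunk to its STDIN, return its exit code."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE)
    try:
        for chunk in chunks:
            proc.stdin.write(chunk)
    except OSError:
        pass  # the child exited early and closed its end of the pipe
    finally:
        try:
            proc.stdin.close()  # signals end of input to the child
        except OSError:
            pass
    return proc.wait()


if __name__ == "__main__":
    # Stand-in "encoder": reads all of STDIN and succeeds only if it got 6 bytes.
    fake_encoder = [
        sys.executable,
        "-c",
        "import sys; sys.exit(0 if len(sys.stdin.buffer.read()) == 6 else 1)",
    ]
    print(pipe_chunks(fake_encoder, [b"abc", b"def"]))
```

Because the encoder only ever sees a byte stream, the chunks can come from several VOB files in a row, which is what makes a separate merge step unnecessary.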

Steps allow us to automate the various steps that each individual video needs to go through in order to produce the output files.

Buffer Size

Vobenc.py uses a default buffer size of 100K. This means that it attempts to read 100K of data from the DVD at a time. On a 1.4GB DVD containing 74 minutes of video, 100K is approximately 1/3rd of a second of information and is a fair compromise.
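The one-third-of-a-second figure can be checked with a quick calculation. The snippet below assumes 1.4GB means 1.4 billion bytes and that the data is spread evenly over the 74 minutes; both are simplifications.

```python
DISC_BYTES = 1.4e9       # 1.4GB, taken as decimal gigabytes (assumption)
DURATION_S = 74 * 60     # 74 minutes of video
BUFFER_BYTES = 102400    # default buffer size, 100K

bytes_per_second = DISC_BYTES / DURATION_S
seconds_per_buffer = BUFFER_BYTES / bytes_per_second

print(f"{bytes_per_second / 1024:.0f}K per second of video")
print(f"{seconds_per_buffer:.2f} seconds of video per buffer")
```

Under these assumptions one buffer works out to roughly 0.3 seconds of video, so a single failed read costs about that much footage.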

When a buffer read fails on the DVD, the entire buffer is skipped and the next buffer is read. This allows us to skip corrupt sections of the VOB without having to manually decide what to do.
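The skip-on-failure idea can be sketched as follows. This is a simplified illustration, not the actual vobenc.py code: copy_skipping_errors is a hypothetical name, a read failure is assumed to surface as an OSError, and the skipped bytes are simply dropped from the output rather than padded.

```python
def copy_skipping_errors(src, write, total_size, buffer_size=102400):
    """Copy total_size bytes from src to write(); skip any buffer that fails.

    Returns the number of bytes lost to failed reads.
    """
    offset = 0
    lost = 0
    while offset < total_size:
        want = min(buffer_size, total_size - offset)
        try:
            src.seek(offset)
            chunk = src.read(want)
        except OSError:
            # Unreadable region: give up on this whole buffer and move on.
            lost += want
            offset += want
            continue
        if not chunk:
            break  # unexpected end of file
        write(chunk)
        offset += len(chunk)
    return lost
```

A caller could pass the STDIN write method of the encoder process as write and report the returned count as the data loss for that VOB.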

If the buffer is too large, more data is lost than necessary, because the whole buffer is discarded even when only a small part of it is unreadable. On the other hand, a very small buffer size slows down reading: a corrupt region is skipped in smaller chunks, so the script runs into more IO errors and timeouts before it gets past the damage. As a result, the buffer size can be changed by the user with -b depending on the state of the DVD. A DVD which is relatively error free could use a larger buffer size, speeding up the read, whereas a DVD with a fair amount of damage could use a smaller buffer size to squeeze out as much information as possible.

Note, however, that very large buffer sizes are not very useful since the main bottleneck is the actual encoding process. The encoder can only handle so many megs of data at a time.

Vobenc.py displays the amount of data loss per VOB at the end of the encoding process.

Download

Vobenc.py is being distributed as a Python script. You will need the Python interpreter installed on your system in order to run it.

Summary

I recently used vobenc.py to encode an entire 1.4GB DVD. It took an entire night to get through the encoding but the entire task was seamless and automated. All I had to do was create a job file and that took all of five minutes.

I’ll be maintaining vobenc.py on this blog post for now. Feel free to provide your feedback.


Filed under: Tips
