Parsing Blu-ray MPLS Playlist Files

  • keywords:
    • software
    • python
    • bluray
    • json

I have a nice little Python function, return_dict_of_media_audio_streams, that parses the JSON output of ffprobe -print_format json -show_streams /path/to/file into a Python dictionary of information about all of the audio streams within a media file. However, ffprobe v3.4 doesn't return the language information for any of the audio streams in a Blu-ray playlist. For example, running ffprobe -probesize 3G -analyzeduration 1800M -playlist 820 bluray:/path/to/br yields:

Note the square brackets just after each stream definition: these are where the language code should be. I decided to do a little bit of investigating to see how hard it would be to grab the language codes myself and add them to the dictionary in my Python function. Rather than filing feature requests with both ffmpeg and libbluray, I decided it would be much quicker to code up a binary reader for the playlist file and merge its data with the data provided by ffprobe myself.
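For illustration, here is a minimal sketch of how ffprobe's JSON output can be turned into a dictionary of audio streams. It is not the actual return_dict_of_media_audio_streams: the function names run_ffprobe and audio_streams_from_json, and the choice of which keys to keep, are assumptions made for this example.

```python
import json
import subprocess


def run_ffprobe(path):
    """Return ffprobe's JSON description of the streams in `path` as text."""
    return subprocess.check_output(
        ["ffprobe", "-v", "quiet", "-print_format", "json", "-show_streams", path],
        encoding="utf-8",
    )


def audio_streams_from_json(text):
    """Build {stream index: info} for every audio stream in ffprobe's JSON.

    The keys kept here ("codec", "channels", "language") are an illustrative
    selection, not the author's actual dictionary layout.
    """
    streams = {}
    for stream in json.loads(text).get("streams", []):
        if stream.get("codec_type") != "audio":
            continue
        streams[stream["index"]] = {
            "codec": stream.get("codec_name"),
            "channels": stream.get("channels"),
            # ffprobe reports the language as a tag; it is None when missing,
            # which is the situation described above for Blu-ray playlists.
            "language": stream.get("tags", {}).get("language"),
        }
    return streams
```

Calling audio_streams_from_json(run_ffprobe("/path/to/file")) would then give one entry per audio stream, with "language" set to None wherever ffprobe did not report it.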

My first action was to open up the binary file "00820.mpls" in a text editor, and behold, there were the language codes surrounded by a bunch of binary gibberish. I clearly just needed an MPLS parser; however, searching online did not turn one up. A lot more searching did manage to produce some unofficial documentation on the file format for this small binary file. The two best resources that I found were:

It turns out that the file is big-endian, so I used the Python function struct.unpack to convert each entry to the appropriate type and size. Following the documentation to make a parser for all of the data structures within the file wasn't hard at all; it was more of a bore than anything else. As an aside, the largest MPLS file that I have ever seen is 78 KiB and a standard single-layer Blu-ray disc holds 25 GB, so I have no idea why the playlist information isn't stored as a JSON or XML file (there will always be some space left on the disc for a slightly larger text format).
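As a small sketch of the struct.unpack approach, the snippet below reads just the fixed-size header of an MPLS file. The layout assumed here (a 4-byte "MPLS" magic, a 4-byte ASCII version, then three unsigned 32-bit big-endian offsets) comes from the unofficial documentation rather than any published specification, and the function name read_mpls_header and its dictionary keys are made up for this example.

```python
import struct


def read_mpls_header(data):
    """Decode the first 20 bytes of an MPLS file (layout is an assumption).

    ">" selects big-endian byte order with no padding, "4s" reads four raw
    bytes and "I" reads an unsigned 32-bit integer.
    """
    if len(data) < 20:
        raise ValueError("file is too short to contain an MPLS header")
    magic, version, playlist, marks, extension = struct.unpack(">4s4sIII", data[:20])
    if magic != b"MPLS":
        raise ValueError("not an MPLS file")
    return {
        "version": version.decode("ascii"),
        "playlist_start": playlist,    # byte offset of the playlist section
        "mark_start": marks,           # byte offset of the playlist marks
        "extension_start": extension,  # byte offset of extension data (0 if none)
    }
```

With a real disc this would be called as read_mpls_header(open("00820.mpls", "rb").read()); the remaining sections can be decoded the same way by seeking to each offset and unpacking field by field.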

My function return_dict_of_media_audio_streams now calls my new sub-module, pyguymer.MPLS, and adds the language code to each stream when a Blu-ray is passed. This means that I can now choose which stream to use based on its language.
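The merging step could look roughly like the sketch below. It does not use the real pyguymer.MPLS interface: it assumes, purely for illustration, that the playlist parser has already produced a list of language codes in the same order as the audio streams reported by ffprobe, and the names add_languages and first_stream_with_language are hypothetical.

```python
def add_languages(streams, languages):
    """Return a copy of `streams` with a "language" key filled in.

    Assumes `languages` holds one code per audio stream, in ascending
    stream-index order; that ordering is an assumption of this sketch.
    """
    merged = {index: dict(info) for index, info in streams.items()}
    for index, code in zip(sorted(merged), languages):
        merged[index]["language"] = code
    return merged


def first_stream_with_language(streams, wanted):
    """Return the lowest stream index whose language matches, else None."""
    for index in sorted(streams):
        if streams[index].get("language") == wanted:
            return index
    return None
```

For example, first_stream_with_language(add_languages(streams, codes), "eng") picks the first English audio stream, or returns None if the disc has none.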