Extract and Preserve Metadata in Videos From Sony Alpha Cameras

Overview

Sony Alpha cameras[1] embed detailed metadata in their video files. However, much of it is stored using non-standard methods that most tools and services fail to recognize.

For example, popular photo services such as Google Photos and Immich do not read the GPS position from these videos, so the videos do not appear in the map view or in location-based searches.

Transcoding tools may also drop the metadata entirely. Besides the GPS information, this usually removes the motion data used by applications such as Gyroflow (if your specific camera model records it).

Outside the standard QuickTime tags, which contain minimal information, there are two primary sources of metadata:

  1. Embedded XML Document: This resides in the root MP4 container and contains static metadata about the video and the camera. This information is also typically saved as a sidecar XML file alongside the video.

  2. Timestamped Data Stream: This is stored in a separate data track within the video file, which ffprobe lists with the codec tag rtmd (for “Realtime Metadata”). It contains a stream of timestamped dynamic data points recorded during the video capture, such as:

    • F-stop, exposure, and ISO settings
    • GPS position (if available)
    • Camera orientation and acceleration (certain models)
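
To make the second source more concrete, here is a minimal Python sketch that walks the MP4 box tree and reports the sample entry type of each track. It assumes the general ISO base media file layout (moov > trak > mdia > minf > stbl > stsd) and that a four-character code such as rtmd is stored as the first sample entry in stsd; it is not based on Sony documentation, and the function names are made up for illustration.

```python
import struct

# Boxes that only contain other boxes on the path down to stsd.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl"}


def iter_boxes(buf, start=0, end=None):
    """Yield (type, payload_start, box_end) for each box in buf[start:end]."""
    end = len(buf) if end is None else end
    pos = start
    while pos + 8 <= end:
        size, box_type = struct.unpack_from(">I4s", buf, pos)
        header = 8
        if size == 1:  # a 64-bit size follows the type field
            if pos + 16 > end:
                break
            size = struct.unpack_from(">Q", buf, pos + 8)[0]
            header = 16
        elif size == 0:  # box runs to the end of the enclosing span
            size = end - pos
        if size < header or pos + size > end:
            break  # truncated or malformed box: stop instead of guessing
        yield box_type, pos + header, pos + size
        pos += size


def sample_entry_types(buf, start=0, end=None):
    """Return the four-character type of the first sample entry of each track."""
    found = []
    for box_type, payload, box_end in iter_boxes(buf, start, end):
        if box_type in CONTAINERS:
            found.extend(sample_entry_types(buf, payload, box_end))
        elif box_type == b"stsd" and box_end - payload >= 16:
            # stsd payload: version/flags (4), entry count (4),
            # then the first entry's size (4) and type (4).
            found.append(bytes(buf[payload + 12:payload + 16]).decode("latin-1"))
    return found
```

The functions accept any bytes-like object. For a real video it is probably better to pass a read-only mmap of the file than to read it fully into memory, since the mdat box holds the entire audio and video payload.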

Sample Video

This video will be used in the examples below. It was recorded with an a6400 camera.

C0140.MP4

The format of the video is detailed below. The rtmd stream (Stream #0:2[0x3]) contains the timestamped dynamic metadata.

user@machine:~$ ffprobe C0140.MP4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'C0140.MP4':
  Metadata:
    compatible_brands: XAVCmp42iso2
    major_brand     : XAVC
    minor_version   : 16785407
    creation_time   : 2023-04-09T06:35:03.000000Z
    encoder         : Lavf60.16.100
  Duration: 00:00:21.03, start: 0.000000, bitrate: 22087 kb/s
  Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), yuv420p(tv, bt709/bt709/iec61966-2-4, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 20055 kb/s, 59.94 fps, 59.94 tbr, 60k tbn (default)
    Metadata:
      creation_time   : 2023-04-09T06:35:03.000000Z
      handler_name    : Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : AVC Coding
  Stream #0:1[0x2](und): Audio: pcm_s16be (ipcm / 0x6D637069), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      creation_time   : 2023-04-09T06:35:03.000000Z
      handler_name    : Sound Media Handler
      vendor_id       : [0][0][0][0]
  Stream #0:2[0x3](und): Data: none (rtmd / 0x646D7472), 491 kb/s (default)
    Metadata:
      creation_time   : 2023-04-09T06:35:03.000000Z
      handler_name    : Timed Metadata Media Handler
      timecode        : 03:37:00:34
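
The same check can be scripted. The sketch below asks ffprobe for JSON output (via its -print_format json and -show_streams options) and picks out data streams whose codec tag is rtmd. The helper names are hypothetical, and the filter assumes the stream is reported with codec_type "data", as in the listing above.

```python
import json
import subprocess


def probe_streams(path):
    """Run ffprobe on a file and return its stream list as parsed JSON."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-print_format", "json", "-show_streams", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout).get("streams", [])


def rtmd_stream_indexes(streams):
    """Return the indexes of data streams whose codec tag is rtmd."""
    return [
        s["index"]
        for s in streams
        if s.get("codec_type") == "data" and s.get("codec_tag_string") == "rtmd"
    ]
```

Calling rtmd_stream_indexes(probe_streams("C0140.MP4")) would be expected to return the index of the third stream shown above. An empty list suggests the file has already lost its timed metadata.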

Metadata Preservation

Many popular media tools, such as FFmpeg, do not recognize the metadata stream shown above and simply drop it when transcoding. In the example below, the stream listing for transcoded.MP4 no longer contains the rtmd stream.

user@machine:~$ ffmpeg -i C0140.MP4 -c:v libx265 -b:v 20M -c:a copy transcoded.MP4

Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'transcoded.MP4':
  Metadata:
    major_brand     : isom
    minor_version   : 512
    compatible_brands: isomiso2mp41
    encoder         : Lavf60.16.100
  Duration: 00:00:21.03, start: 0.000000, bitrate: 21104 kb/s
  Stream #0:0[0x1](und): Video: hevc (Main) (hev1 / 0x31766568), yuv420p(tv, bt709/bt709/iec61966-2-4, progressive), 1920x1080 [SAR 1:1 DAR 16:9], 19558 kb/s, 59.94 fps, 59.94 tbr, 60k tbn (default)
    Metadata:
      handler_name    : Video Media Handler
      vendor_id       : [0][0][0][0]
      encoder         : Lavc60.31.102 libx265
  Stream #0:1[0x2](und): Audio: pcm_s16be (ipcm / 0x6D637069), 48000 Hz, stereo, s16, 1536 kb/s (default)
    Metadata:
      handler_name    : Sound Media Handler
      vendor_id       : [0][0][0][0]

If you attempt to force FFmpeg to copy the stream using -map 0, you will receive an error about an unknown codec.

user@machine:~$ ffmpeg -i C0140.MP4 -map 0 -c:v libx265 -b:v 20M -c:a copy transcoded.MP4

[mp4 @ 0x55c1d03f1440] Could not find tag for codec none in stream #2, codec not currently supported in container
[out#0/mp4 @ 0x55c1d03b5cc0] Could not write header (incorrect codec parameters ?): Invalid argument
Error while filtering: Invalid argument
[out#0/mp4 @ 0x55c1d03b5cc0] Nothing was written into output file, because at least one of its streams received no packets.

Using a tool like MP4Box, which operates at a lower level on the MP4 container, we can copy the meta box or data stream from the original file without knowing the actual format of the data.

# Copy the timed metadata to the transcoded video
MP4Box -add C0140.MP4#3 transcoded.MP4

# Copy the metadata atom to the transcoded video
MP4Box -dump-xml /tmp/C0140.xml C0140.MP4
MP4Box -set-meta META -set-xml /tmp/C0140.xml transcoded.MP4
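
If you want to run these steps from a script, the following Python sketch wraps the same three MP4Box invocations. It uses only the options shown above; the function names are hypothetical, and the default track ID of 3 is an assumption taken from the [0x3] identifier in the ffprobe output for this sample, so other files may need a different value.

```python
import os
import subprocess
import tempfile


def metadata_copy_commands(source, target, track_id, xml_path):
    """Build the MP4Box commands that copy Sony metadata from source to target."""
    return [
        # Copy the timed metadata track to the transcoded video.
        ["MP4Box", "-add", f"{source}#{track_id}", target],
        # Dump the XML document from the source, then attach it to the target.
        ["MP4Box", "-dump-xml", xml_path, source],
        ["MP4Box", "-set-meta", "META", "-set-xml", xml_path, target],
    ]


def copy_metadata(source, target, track_id=3):
    """Run the commands in order, stopping at the first failure."""
    with tempfile.TemporaryDirectory() as tmp:
        xml_path = os.path.join(tmp, "meta.xml")
        for command in metadata_copy_commands(source, target, track_id, xml_path):
            subprocess.run(command, check=True)
```

Keeping the command construction separate from the execution makes it easy to print the commands for a dry run before modifying any files.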

Copying GPS Position to Standard Tags

Because the GPS position is not written to standard QuickTime tags, image storage services such as Google Photos or Immich will not recognize it.

ExifTool is capable of reading many proprietary formats, including those found in videos from Sony cameras. Using the -ee/-extractEmbedded flag, it will extract the GPS position from the embedded metadata, which will be exposed through the composite GPSLatitude, GPSLongitude, and GPSPosition tags.

Immich actually uses ExifTool to read metadata, but will miss this information because it doesn’t use the required flags.

user@machine:~$ exiftool -extractEmbedded --duplicates -location:all C0140.MP4

GPS Version ID                  : 2.2.0.0
GPS Latitude Ref                : North
GPS Longitude Ref               : East
GPS Status                      : Measurement Active
GPS Measure Mode                : 2-Dimensional Measurement
GPS Map Datum                   : WGS-84
GPS Latitude                    : 34 deg 41' 12.43" N
GPS Longitude                   : 135 deg 31' 32.02" E
GPS Position                    : 34 deg 41' 12.43" N, 135 deg 31' 32.02" E
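
Some tools expect decimal degrees rather than the degree/minute/second strings printed above. ExifTool can print numeric values itself with its -n option, but as an illustration of the arithmetic, here is a small Python sketch that converts one coordinate string. The function name is made up, and the pattern only covers the exact format shown in this output.

```python
import re

# Matches strings such as: 34 deg 41' 12.43" N
_DMS = re.compile(r"""(\d+) deg (\d+)' ([\d.]+)" ([NSEW])""")


def dms_to_decimal(text):
    """Convert an ExifTool-style DMS coordinate to signed decimal degrees."""
    match = _DMS.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized coordinate: {text!r}")
    degrees, minutes, seconds, ref = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    # Southern latitudes and western longitudes are negative.
    return -value if ref in "SW" else value
```

For the latitude above, this gives 34 + 41/60 + 12.43/3600, or roughly 34.68679 degrees north.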

Using ExifTool again, we can write this extracted information into standard tags that most tools and services will recognize.

exiftool -extractEmbedded --duplicates '-GPSLatitude<$GPSLatitude' '-GPSLongitude<$GPSLongitude' '-UserData:GPSCoordinates<$GPSPosition' C0140.MP4
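
To apply the same command to several videos, a thin Python wrapper is one option. The sketch below reuses the exact arguments from the command above; the function names are hypothetical. Because the arguments are passed as a list without a shell, the single quotes that protect the $ signs on the command line are not needed.

```python
import subprocess

# Same options and tag copies as the command-line example.
GPS_COPY_ARGS = [
    "-extractEmbedded",
    "--duplicates",
    "-GPSLatitude<$GPSLatitude",
    "-GPSLongitude<$GPSLongitude",
    "-UserData:GPSCoordinates<$GPSPosition",
]


def gps_copy_command(paths):
    """Build one exiftool invocation covering all of the given files."""
    return ["exiftool", *GPS_COPY_ARGS, *paths]


def standardize_gps(paths):
    """Write the extracted GPS position into standard tags for each file."""
    subprocess.run(gps_copy_command(paths), check=True)
```

By default, ExifTool keeps a backup of each modified file with an _original suffix (for example C0140.MP4_original), so the untouched video remains available if the result is not what you expected.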

Utility Scripts

This repository contains a set of scripts that can be used to work with videos from Sony cameras. Most of them are simply wrappers around the commands described above.

These scripts require FFmpeg, MP4Box, and ExifTool to be installed.

  • video_convert.sh: This script uses FFmpeg and MP4Box to transcode a video while preserving all metadata.
  • copy_video_metadata.sh: This script copies metadata from an original source file to an existing transcoded file. It is useful for videos that were transcoded previously, or if you don’t want to transcode with the script above.
  • video_gps_standardize.sh: This script extracts GPS coordinates from the embedded XML data and writes them into more standard GPS metadata tags, making the video compatible with platforms like Google Photos or Immich.
  • xml_to_xmp.py: This script converts Sony XML metadata to the XMP format. The XML data can be extracted using the MP4Box command shown above, but it is also usually found as a sidecar file alongside the original video.

Additional Notes

Gyroflow currently cannot load a transcoded video whose metadata was copied via MP4Box. This is due to a bug in mp4parse-rust, as noted in this GitHub issue. The telemetry-parser project, which uses mp4parse-rust, has removed the problematic meta-xml feature as of this commit, so a future version of Gyroflow should work correctly.

  1. This information may also apply to some models outside the Alpha series.

This post is licensed under CC BY 4.0 by the author.