HTML5 VIDEO bytes on iOS

April 21, 2013 8:03 pm | 18 Comments

UPDATE: The problem is worse on iOS 7. Although the Apple developer guide still states that VIDEO PRELOAD is disabled on iOS, navigating to a page that uses the VIDEO tag results in large amounts of video data being preloaded. 81K of video is preloaded for the 4M VIDEO test (compared to 61K in iOS 6). For the 62M VIDEO test, iOS 7 preloads 542K of video data (compared to 298K for iOS 6). If you use iOS 7 and pay for data, be careful about navigating to pages that contain video, even if you don’t play the video.

HTML5 provides the VIDEO element. It includes the PRELOAD attribute that takes various values such as “none”, “metadata”, and “auto”. Mobile devices ignore all values of PRELOAD in order to avoid high data plan costs, and instead only download the video when the user initiates playback. This is explained in the Safari Developer Library:

In Safari on iOS (for all devices, including iPad), where the user may be on a cellular network and be charged per data unit, preload and autoplay are disabled. No data is loaded until the user initiates it.

However, my testing shows that iOS downloads up to 298K of video data, resulting in unexpected costs to users.


In my previous post, HTML5 Video Preload, I analyzed how much data is buffered for various values of the VIDEO tag’s PRELOAD attribute. For example, specifying preload='none' ensures no video is preloaded, whereas preload='auto' results in 25 or more seconds of buffered video on desktop browsers depending on the size of the video.

The results for mobile browsers are different. Mobile browsers don’t preload any video data, no matter what PRELOAD value is specified. These preload results are based on measuring the VIDEO element’s buffered property. Using that API shows that zero bytes of data is buffered on all mobile devices including the iPhone. (Look for “Mobile Safari 6” as well as a dozen other mobile devices in the detailed results.)

While it’s true that Mobile Safari on iOS doesn’t buffer any video data as a result of the PRELOAD attribute, it does make other video requests that aren’t counted as “buffered” video. The number and size of the requests and responses depends on the video. For larger videos the total amount of data for these behind-the-scenes requests can be significant.

Unseen VIDEO Requests

[This section contains the technical details of how I found these video requests. Go to Confirming Results, More Observations if you want to skip over these details.]

While testing VIDEO PRELOAD on my iPhone, I noticed that even though the amount of buffered data was “0”, there were still multiple video requests hitting my server. Here are the video requests I see in my Apache access log when I load the test page with preload=’none’ (4M video) on my iPhone (iPhone4 iOS6 running standard mobile Safari):

[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 319 "-" "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)"
[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 70080 "-" "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)"
[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 47330 "-" "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)"

There are three requests. They all return a “206 Partial Content” status code. The sizes of the responses shown in the access log are 319 bytes, 70,080 bytes, and 47,330 bytes respectively.
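Pulling the status code and byte count out of log lines like these is easy to script. Here is a minimal Python sketch, assuming the trimmed log format shown above (timestamp, request, status, bytes, referer, user agent); the names `LOG_LINE`, `parse_log_line`, and `logged_sizes` are made up for illustration.

```python
import re

# Assumed layout, matching the trimmed lines above:
# [timestamp] "request" status bytes "referer" "user-agent"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" (?P<status>\d{3}) '
    r'(?P<bytes>\d+|-) "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

def parse_log_line(line):
    """Return a dict for one access log line, or None if it does not match."""
    m = LOG_LINE.search(line)
    if m is None:
        return None
    # Apache logs "-" when no bytes were sent; treat that as zero.
    size = 0 if m.group("bytes") == "-" else int(m.group("bytes"))
    return {
        "request": m.group("request"),
        "status": int(m.group("status")),
        "bytes": size,
        "agent": m.group("agent"),
    }

def logged_sizes(lines):
    """Return the logged response sizes for every line that parses."""
    return [e["bytes"] for e in map(parse_log_line, lines) if e is not None]

UA = "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)"
SAMPLE = [
    '[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 319 "-" "%s"' % UA,
    '[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 70080 "-" "%s"' % UA,
    '[16/Apr/2013:15:03:48 -0700] "GET /tests/trailer.mp4?t=1366149827 HTTP/1.1" 206 47330 "-" "%s"' % UA,
]

if __name__ == "__main__":
    sizes = logged_sizes(SAMPLE)
    print(sizes, sum(sizes))
```

Keep in mind these are the sizes the server logged, which is not necessarily what the phone received.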

It’s possible that the server sends bytes but the client (my iPhone) doesn’t receive every packet. Since these video requests aren’t reflected using the VIDEO element’s API, I measure the actual bytes sent over the wire using tcpdump and a wifi hotspot. (See the setup details here.) This generates a pcap file that I named mediaevents-iphone.pcap. The first step is to find the connections used to make the video requests using:

tcpdump -qns 0 -A -r mediaevents-iphone.pcap

The output shows the HTTP headers and response data for every request. Let’s find the three video requests.

video request #1

Here’s the excerpt relevant for the first video request (with some binary data removed):

15:03:48.162970 IP > tcp 327
GET /tests/trailer.mp4?t=1366149827 HTTP/1.1
Range: bytes=0-1
X-Playback-Session-Id: E4C46EAC-17EC-491E-81F7-4EBF4B7BE12B
Accept-Encoding: identity
Accept: */*
Accept-Language: en-us
Connection: keep-alive
User-Agent: AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)

15:03:48.183732 IP > tcp 0
15:03:48.186327 IP > tcp 319
HTTP/1.1 206 Partial Content
Date: Tue, 16 Apr 2013 22:03:48 GMT
Server: Apache
Last-Modified: Thu, 13 May 2010 17:49:03 GMT
ETag: "42b795-4867d5fcac1c0"
Accept-Ranges: bytes
Content-Length: 2
Content-Range: bytes 0-1/4372373
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: video/mp4

I highlighted some important information. This first request for trailer.mp4 occurs on connection #49186. The iPhone only requests bytes 0-1, and the server returns only those two bytes. The Content-Range response header also indicates the total size of the video: 4,372,373 bytes (~4.2M).

video request #2

Here’s the excerpt relevant for the second video request:

15:03:48.325209 IP > tcp 333
GET /tests/trailer.mp4?t=1366149827 HTTP/1.1
Range: bytes=0-4372372
X-Playback-Session-Id: E4C46EAC-17EC-491E-81F7-4EBF4B7BE12B
Accept-Encoding: identity
Accept: */*
Accept-Language: en-us
Connection: keep-alive
User-Agent: AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)

15:03:48.349254 IP > tcp 1460
HTTP/1.1 206 Partial Content
Date: Tue, 16 Apr 2013 22:03:48 GMT
Server: Apache
Last-Modified: Thu, 13 May 2010 17:49:03 GMT
ETag: "42b795-4867d5fcac1c0"
Accept-Ranges: bytes
Content-Length: 4372373
Content-Range: bytes 0-4372372/4372373
Keep-Alive: timeout=2, max=99
Connection: Keep-Alive
Content-Type: video/mp4

This request also uses connection #49186. The iPhone requests bytes 0-4372372 (the entire video). The Content-Length header implies that 4,372,373 bytes are returned but we’ll soon see it’s much less than that.

video request #3

Here’s the excerpt relevant for the third video request:

15:03:48.409745 IP > tcp 339
GET /tests/trailer.mp4?t=1366149827 HTTP/1.1
Range: bytes=4325376-4372372
X-Playback-Session-Id: E4C46EAC-17EC-491E-81F7-4EBF4B7BE12B
Accept-Encoding: identity
Accept: */*
Accept-Language: en-us
Connection: keep-alive
User-Agent: AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)

15:03:48.435684 IP > tcp 0
15:03:48.438211 IP > tcp 1460
HTTP/1.1 206 Partial Content
Date: Tue, 16 Apr 2013 22:03:48 GMT
Server: Apache
Last-Modified: Thu, 13 May 2010 17:49:03 GMT
ETag: "42b795-4867d5fcac1c0"
Accept-Ranges: bytes
Content-Length: 46997
Content-Range: bytes 4325376-4372372/4372373
Keep-Alive: timeout=2, max=100
Connection: Keep-Alive
Content-Type: video/mp4

The third request is issued on a new connection: #49187. The iPhone requests the last 46,997 bytes of the video. This is likely the video’s metadata (or “moov atom”).
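The Content-Range values in these three responses can be checked mechanically. This is a small Python sketch, assuming the simple `bytes first-last/total` form seen above (it does not handle the `*` forms the HTTP spec also allows); `parse_content_range` and `describe` are hypothetical helper names.

```python
import re

# Only the "bytes first-last/total" form that appears in the responses above.
CONTENT_RANGE = re.compile(r"bytes (\d+)-(\d+)/(\d+)")

def parse_content_range(value):
    """Return (first, last, total) from a Content-Range header value."""
    m = CONTENT_RANGE.fullmatch(value.strip())
    if m is None:
        raise ValueError("unsupported Content-Range: %r" % value)
    first, last, total = (int(g) for g in m.groups())
    return first, last, total

def describe(value):
    """Summarize how much of the file a Content-Range value covers."""
    first, last, total = parse_content_range(value)
    return {
        "length": last - first + 1,        # byte positions are inclusive
        "total": total,                    # full size of the resource
        "reaches_end": last == total - 1,  # True if the range ends at EOF
    }

if __name__ == "__main__":
    for header in (
        "bytes 0-1/4372373",
        "bytes 0-4372372/4372373",
        "bytes 4325376-4372372/4372373",
    ):
        print(header, describe(header))
```

For the third response this gives a length of 46,997 bytes ending at the last byte of the file, which matches its Content-Length header.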

total packet size

I noted the connection #s because in my next step I use those to view the video file packets using this command:

tcpdump -r mediaevents-iphone.pcap | grep -E "(49186|49187)"

Here’s the output. It’s long.

15:03:48.147958 IP > Flags [.], ack 1, win 16384, length 0
15:03:48.162970 IP > Flags [P.], seq 1:328, ack 1, win 16384, length 327
15:03:48.183732 IP > Flags [.], ack 328, win 14, length 0
15:03:48.186327 IP > Flags [P.], seq 1:320, ack 328, win 14, length 319
15:03:48.190322 IP > Flags [.], ack 320, win 16364, length 0
15:03:48.325209 IP > Flags [P.], seq 328:661, ack 320, win 16384, length 333
15:03:48.349254 IP > Flags [.], seq 320:1780, ack 661, win 16, length 1460
15:03:48.349480 IP > Flags [.], seq 1780:3240, ack 661, win 16, length 1460
15:03:48.349520 IP > Flags [.], seq 3240:4700, ack 661, win 16, length 1460
15:03:48.350034 IP > Flags [.], seq 4700:6160, ack 661, win 16, length 1460
15:03:48.355531 IP > Flags [.], ack 3240, win 16292, length 0
15:03:48.355699 IP > Flags [.], ack 6160, win 16110, length 0
15:03:48.366281 IP > Flags [F.], seq 661, ack 6160, win 16384, length 0
15:03:48.378012 IP > Flags [.], seq 6160:7620, ack 661, win 16, length 1460
15:03:48.378086 IP > Flags [.], seq 7620:9080, ack 661, win 16, length 1460
15:03:48.378320 IP > Flags [S], seq 3267277275, win 65535, options [mss 1460,nop,wscale 4,nop,nop,TS val 129308742 ecr 0,sackOK,eol], length 0
15:03:48.378417 IP > Flags [.], seq 9080:10540, ack 661, win 16, length 1460
15:03:48.378675 IP > Flags [.], seq 10540:12000, ack 661, win 16, length 1460
15:03:48.379705 IP > Flags [.], seq 12000:13460, ack 661, win 16, length 1460
15:03:48.379760 IP > Flags [.], seq 13460:14920, ack 661, win 16, length 1460
15:03:48.383237 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.383430 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.383761 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.384064 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.384297 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.385553 IP > Flags [R], seq 3028115607, win 0, length 0
15:03:48.399069 IP > Flags [S.], seq 277046834, ack 3267277276, win 5840, options [mss 1460,nop,nop,sackOK,nop,wscale 9], length 0
15:03:48.401299 IP > Flags [.], ack 1, win 16384, length 0
15:03:48.409745 IP > Flags [P.], seq 1:340, ack 1, win 16384, length 339
15:03:48.435684 IP > Flags [.], ack 340, win 14, length 0
15:03:48.438211 IP > Flags [.], seq 1:1461, ack 340, win 14, length 1460
15:03:48.438251 IP > Flags [.], seq 1461:2921, ack 340, win 14, length 1460
15:03:48.438671 IP > Flags [.], seq 2921:4381, ack 340, win 14, length 1460
15:03:48.444261 IP > Flags [.], ack 2921, win 16201, length 0
15:03:48.446763 IP > Flags [.], ack 4381, win 16384, length 0
15:03:48.464208 IP > Flags [.], seq 4381:5841, ack 340, win 14, length 1460
15:03:48.464589 IP > Flags [.], seq 5841:7301, ack 340, win 14, length 1460
15:03:48.465088 IP > Flags [.], seq 7301:8761, ack 340, win 14, length 1460
15:03:48.469165 IP > Flags [.], seq 8761:10221, ack 340, win 14, length 1460
15:03:48.469269 IP > Flags [.], seq 10221:11681, ack 340, win 14, length 1460
15:03:48.474194 IP > Flags [.], ack 7301, win 16201, length 0
15:03:48.474491 IP > Flags [.], ack 10221, win 16019, length 0
15:03:48.481544 IP > Flags [.], ack 11681, win 16384, length 0
15:03:48.495319 IP > Flags [.], seq 11681:13141, ack 340, win 14, length 1460
15:03:48.495878 IP > Flags [.], seq 13141:14601, ack 340, win 14, length 1460
15:03:48.495928 IP > Flags [.], seq 14601:16061, ack 340, win 14, length 1460
15:03:48.500477 IP > Flags [.], seq 16061:17521, ack 340, win 14, length 1460
15:03:48.500542 IP > Flags [.], seq 17521:18981, ack 340, win 14, length 1460
15:03:48.500743 IP > Flags [.], seq 18981:20441, ack 340, win 14, length 1460
15:03:48.503543 IP > Flags [.], ack 14601, win 16201, length 0
15:03:48.504110 IP > Flags [.], ack 17521, win 16019, length 0
15:03:48.508600 IP > Flags [.], ack 20441, win 16201, length 0
15:03:48.524397 IP > Flags [.], seq 20441:21901, ack 340, win 14, length 1460
15:03:48.524753 IP > Flags [.], seq 21901:23361, ack 340, win 14, length 1460
15:03:48.524773 IP > Flags [.], seq 23361:24821, ack 340, win 14, length 1460
15:03:48.525207 IP > Flags [.], seq 24821:26281, ack 340, win 14, length 1460
15:03:48.525756 IP > Flags [.], seq 26281:27741, ack 340, win 14, length 1460
15:03:48.533224 IP > Flags [.], ack 23361, win 16201, length 0
15:03:48.533395 IP > Flags [.], ack 26281, win 16019, length 0
15:03:48.533553 IP > Flags [.], ack 27741, win 16384, length 0
15:03:48.533714 IP > Flags [.], seq 27741:29201, ack 340, win 14, length 1460
15:03:48.534656 IP > Flags [.], seq 29201:30661, ack 340, win 14, length 1460
15:03:48.534753 IP > Flags [.], seq 30661:32121, ack 340, win 14, length 1460
15:03:48.534835 IP > Flags [.], seq 32121:33581, ack 340, win 14, length 1460
15:03:48.535268 IP > Flags [.], seq 33581:35041, ack 340, win 14, length 1460
15:03:48.535933 IP > Flags [.], seq 35041:36501, ack 340, win 14, length 1460
15:03:48.540701 IP > Flags [.], ack 30661, win 16292, length 0
15:03:48.540792 IP > Flags [.], ack 33581, win 16110, length 0
15:03:48.540966 IP > Flags [.], ack 36501, win 15927, length 0
15:03:48.555942 IP > Flags [.], seq 36501:37961, ack 340, win 14, length 1460
15:03:48.556276 IP > Flags [.], seq 37961:39421, ack 340, win 14, length 1460
15:03:48.556767 IP > Flags [.], seq 39421:40881, ack 340, win 14, length 1460
15:03:48.556810 IP > Flags [.], seq 40881:42341, ack 340, win 14, length 1460
15:03:48.557342 IP > Flags [.], seq 42341:43801, ack 340, win 14, length 1460
15:03:48.557818 IP > Flags [.], seq 43801:45261, ack 340, win 14, length 1460
15:03:48.565012 IP > Flags [.], ack 39421, win 16201, length 0
15:03:48.565188 IP > Flags [.], ack 42341, win 16019, length 0
15:03:48.565232 IP > Flags [.], ack 45261, win 15836, length 0
15:03:48.565417 IP > Flags [.], seq 45261:46721, ack 340, win 14, length 1460
15:03:48.565477 IP > Flags [P.], seq 46721:47331, ack 340, win 14, length 610
15:03:48.571326 IP > Flags [.], ack 46721, win 16384, length 0
15:03:48.571664 IP > Flags [.], ack 47331, win 16345, length 0
15:03:50.444574 IP > Flags [F.], seq 47331, ack 340, win 14, length 0
15:03:50.471892 IP > Flags [.], ack 47332, win 16384, length 0

There are a lot of interesting things to see in there, but staying on task, let’s figure out how much video data was actually received by the iPhone. We’ll look at packets sent from the server to my iPhone and add up the length values. The total comes to 62,249 bytes (~61K). Separating it by request comes to 319 bytes for the first response, 14,600 bytes for the second response (ten 1,460-byte packets), and 47,330 bytes for the third response. Let’s compare this true byte size to what we saw in the Apache access logs and what was in the Content-Length response headers:

Request      Apache access log   Content-Length     Actual bytes received
request #1   319 bytes           2 bytes            319 bytes
request #2   70,080 bytes        4,372,373 bytes    14,600 bytes
request #3   47,330 bytes        46,997 bytes       47,330 bytes

What this tells us is that you can’t always trust what is shown in the server access logs, or even the Content-Length headers. The sizes of requests 1 and 3 are consistent when you take into account that the size of headers is not included in the Content-Length (the differences, 317 and 333 bytes respectively, are the response headers). Request #2’s sizes are all over the place. Looking at the packets in more detail reveals why: the iPhone tried to close the connection after receiving 5,840 bytes and then finally reset the connection after 14,600 bytes. Thus, the actual amount of data downloaded was different from what the access log and Content-Length indicated.

The point is, some days you have to drop down into tcpdump and pcap files to get the truth. For this ~4M video looking at the pcap files shows that iOS downloaded ~61K of video data. (Thanks to Arvind Jain for helping me decipher the pcap files!)
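Adding up the length values by hand gets tedious, so here is a rough Python sketch of the same bookkeeping. It assumes tcpdump's one-line text output with numeric `host.port` addresses (the `-n` flag). The addresses `192.0.2.10` and `192.0.2.20` are hypothetical placeholders from the documentation range, not the ones in my capture, and `payload_bytes_from` is a made-up helper name. Note that it sums TCP payload lengths only, so retransmitted packets are counted twice and TCP/IP header overhead is not counted at all.

```python
import re

# One tcpdump text line: "<time> IP <src> > <dst>: Flags [...], ..., length N"
PACKET = re.compile(r"IP (?P<src>\S+) > (?P<dst>\S+): .*\blength (?P<length>\d+)")

def payload_bytes_from(lines, host):
    """Sum the 'length' field of every packet sent by the given host."""
    total = 0
    for line in lines:
        m = PACKET.search(line)
        if m is None:
            continue  # not a packet line, or a line that was wrapped
        # tcpdump -n prints host.port, so drop the trailing port component.
        src_host = m.group("src").rsplit(".", 1)[0]
        if src_host == host:
            total += int(m.group("length"))
    return total

SERVER = "192.0.2.10"  # hypothetical server address
PHONE = "192.0.2.20"   # hypothetical iPhone address

SAMPLE = [
    "15:03:48.162970 IP 192.0.2.20.49186 > 192.0.2.10.80: Flags [P.], seq 1:328, ack 1, win 16384, length 327",
    "15:03:48.183732 IP 192.0.2.10.80 > 192.0.2.20.49186: Flags [.], ack 328, win 14, length 0",
    "15:03:48.186327 IP 192.0.2.10.80 > 192.0.2.20.49186: Flags [P.], seq 1:320, ack 328, win 14, length 319",
    "15:03:48.349254 IP 192.0.2.10.80 > 192.0.2.20.49186: Flags [.], seq 320:1780, ack 661, win 16, length 1460",
]

if __name__ == "__main__":
    print("server -> phone:", payload_bytes_from(SAMPLE, SERVER))
    print("phone -> server:", payload_bytes_from(SAMPLE, PHONE))
```

In practice you would feed it the lines from `tcpdump -n -r mediaevents-iphone.pcap` rather than a hard-coded sample.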

Confirming Results, More Observations

Here are some additional observations from my testing.

Larger videos => more data: The amount of data the iPhone downloads increases for larger videos. The preload=’none’ test page for the 62M video generates seven video requests. Using the same tcpdump technique shows this larger video resulted in 304,918 bytes (~298K) of video data being downloaded.

Data not “re-used”: To make matters worse, this video data that is downloaded behind-the-scenes doesn’t reduce the download size when the user initiates playback. To test this I started a new tcpdump capture and started playing the 4.2M video. This resulted in downloading 4,401,911 bytes (~4.2M) – the size of the entire video.

Happens on cell networks: All of the tests so far were done over wifi, but data plan costs only occur over mobile networks. One possibility is that the iPhone downloads this video data on wifi, but not over carrier networks. Unfortunately, that’s not the case. I tested this by turning off wifi on my iPhone, closing all apps, resetting my cellular usage statistics, and loading the test pages five times.

For the test page with preload=’none’ (4M video) the amount of Received cellular data is 354K for all five page loads, or ~71K per page. The test page without the VIDEO tag is ~8K, so this closely matches the earlier findings that ~61K of video data is being downloaded in the background. For the preload=’none’ test page for the 62M video the amount of Received cellular data is 1.6M for five page loads, or ~320K per page. After accounting for the size of the test page and rounding errors this is close to the ~298K of video data that was found for this larger file. The conclusion is that these unseen video requests occur when the iPhone is on a cellular network as well as on wifi.
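The per-page arithmetic is simple enough to write down. This Python sketch assumes the ~8K page weight applies to both test pages and reads 1.6M as 1,600K, as the rounding above does; `video_kb_per_load` is a hypothetical name.

```python
def video_kb_per_load(total_kb, loads, page_kb):
    """Estimate video KB per page load from a cellular usage counter.

    total_kb: Received cellular data across all loads
    loads:    number of page loads
    page_kb:  weight of the same page without the VIDEO tag (assumed ~8K)
    """
    return total_kb / loads - page_kb

if __name__ == "__main__":
    # 4M video: 354K over five loads, compare with ~61K from tcpdump
    print(round(video_kb_per_load(354, 5, 8), 1))
    # 62M video: 1.6M over five loads, compare with ~298K from tcpdump
    print(round(video_kb_per_load(1600, 5, 8), 1))
```

The estimates land within a few percent of the tcpdump numbers, which is about as good as the phone's coarse usage counter allows.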

Only happens on iOS: These extraneous video requests don’t happen on my Samsung Galaxy Nexus (Android 4.2.2) using either the default Android Browser or Chrome for Android 26. Because the tests are so hands-on I can’t test other phones using Browserscope. However, knowing that a dozen or so different mobile devices ran the test on my server I searched through my access logs from the point where I published the HTML5 Video Preload blog post to see if any of them generated requests for trailer.mp4. The only ones that showed up were iPhone and iPad, suggesting that none of the other mobile devices generate these background video requests.

Strange User-Agents, no Referer: The User-Agent request header for mobile Safari on my iPhone is:

Mozilla/5.0 (iPhone; CPU iPhone OS 6_1_2 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10B146 Safari/8536.25

But the User-Agent for the video requests is:

AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)

They’re different!

My Galaxy Nexus has similar behavior. The User-Agent for Android browser is:

Mozilla/5.0 (Linux; U; Android 4.2.2; en-us; Galaxy Nexus Build/JDQ39) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.30

Chrome for Android is:

Mozilla/5.0 (Linux; Android 4.2.2; Galaxy Nexus Build/JDQ39) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.58 Mobile Safari/537.31

But both send this User-Agent when requesting video:

stagefright/1.2 (Linux;Android 4.2.2)

It’s good to keep this in mind when investigating video requests on mobile. Another hassle is that all three of these mobile browsers omit the Referer request header. This seems like a clear oversight that makes it hard to correlate video playback with page views.
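Since the Referer is missing, the User-Agent is about the only handle for spotting these requests in a log. Here is a minimal Python sketch that flags requests from the two media stacks seen above. The prefix list only covers the agents observed in this post and is surely incomplete; the function names are made up.

```python
from collections import Counter

# Product tokens observed in this post; other devices may use different ones.
MEDIA_STACK_PREFIXES = ("AppleCoreMedia/", "stagefright/")

def is_media_stack_request(user_agent):
    """True if the User-Agent looks like a media stack rather than a browser."""
    return user_agent.startswith(MEDIA_STACK_PREFIXES)

def media_requests_by_stack(user_agents):
    """Count media-stack requests, keyed by the product name before the slash."""
    return Counter(
        ua.split("/", 1)[0] for ua in user_agents if is_media_stack_request(ua)
    )

AGENTS = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 6_1_2 like Mac OS X) AppleWebKit/536.26 (KHTML, like Gecko) Version/6.0 Mobile/10B146 Safari/8536.25",
    "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)",
    "AppleCoreMedia/ (iPhone; U; CPU OS 6_1_2 like Mac OS X; en_us)",
    "stagefright/1.2 (Linux;Android 4.2.2)",
    "Mozilla/5.0 (Linux; Android 4.2.2; Galaxy Nexus Build/JDQ39) AppleWebKit/537.31 (KHTML, like Gecko) Chrome/26.0.1410.58 Mobile Safari/537.31",
]

if __name__ == "__main__":
    print(media_requests_by_stack(AGENTS))
```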

No Good Workaround

I mentioned this blog post to my officemate and he pointed out Jim Wilson’s comments in his post on Breaking the 1000ms Time to Glass Mobile Barrier. It appears Jim also saw this unseen video requests issue. He went further to find a way to avoid bandwidth contention:

We found that if the <video> tag had any information at all, the mobile device would try to oblige it. What we wanted was for something to start happening right away (e.g. showing the poster) but if we gave the video tag something to chew on it would slow down the device. From the user’s perspective nothing was happening.

So our latest version inlines critical styles into the head, uses a <div> with background:url() for the poster, has an empty <video> tag, dynamically loads Video.js, and when it’s done (onready/onload) sets the source through the Video.js api.

I added the emphasis to highlight his solution. The VIDEO tag’s markup doesn’t specify the SRC attribute. This avoids any behind-the-scenes video requests on iOS that contend for bandwidth with resources that are visible in the page (such as the POSTER image). The SRC is set later via JavaScript.

While this technique avoids the issue of bandwidth contention, it doesn’t avoid the extra video requests. When the SRC is set later it results in video data being downloaded. In fact, when I tested this technique it actually resulted in more video data being downloaded leading to even higher data costs.

It’s possible that moving the metadata to the front of the video file may reduce the amount of data downloaded. The 19M video from Video.js is formatted this way and downloads less video data than my 4M test video. Arranging the MP4 data more efficiently should be further investigated to see if it can reduce the amount of data downloaded.
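To check how a given file is arranged, you can walk its top-level boxes ("atoms"). Each box starts with a 4-byte big-endian size followed by a 4-byte type; a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the file. The Python sketch below does this for a byte string already in memory, which is an assumption made for brevity (a real tool would seek through the file instead). The `box` helper just builds tiny synthetic boxes for the demo, and all names are made up.

```python
import struct

def top_level_boxes(data):
    """Yield (type, offset, size) for each top-level box in MP4 bytes."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # Box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        yield box_type.decode("latin-1"), offset, size
        offset += size

def moov_before_mdat(data):
    """True if the moov box appears before the mdat box."""
    order = [t for t, _, _ in top_level_boxes(data) if t in ("moov", "mdat")]
    return order[:1] == ["moov"]

def box(kind, payload=b""):
    """Build a synthetic box for testing (kind is 4 bytes, e.g. b'moov')."""
    return struct.pack(">I4s", 8 + len(payload), kind) + payload

if __name__ == "__main__":
    metadata_first = box(b"ftyp", b"isom") + box(b"moov", b"m" * 10) + box(b"mdat", b"d" * 20)
    metadata_last = box(b"ftyp", b"isom") + box(b"mdat", b"d" * 20) + box(b"moov", b"m" * 10)
    print(moov_before_mdat(metadata_first))
    print(moov_before_mdat(metadata_last))
```

A file like my 4M test video, where the third request fetches metadata from the end of the file, would presumably come out as moov-after-mdat.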


When the HTML5 VIDEO tag is used, iOS downloads video data without the user initiating playback. In the tests described here the amount of video data downloaded ranged from 61K to 298K. This behavior differs from other mobile devices. This means simply visiting a page that uses the VIDEO element on an iPhone or iPad could result in unexpected cellular network data charges. Unfortunately, there’s no good workaround to avoid these extra video requests. I’ve submitted a bug to Apple asking that iOS avoid these extra data costs similar to other mobile devices.


18 Responses to HTML5 VIDEO bytes on iOS

  1. Interesting! Nothing brings up the truth like the bowels of Apache log. Can anyone confirm if that happens on other versions of iOS or this is a version-specific bug?

  2. It would be nice to do the same tests with the MP4’s metadata re-arranged. Especially the moov atom which better be before the mdat atom.

    If the moov is positioned after the mdat, CoreMedia will have to download the whole file before doing anything. It surely doesn’t explain why there are some requests with preload=”none” but might reduce them.

    Thanks for the post, very useful.

  3. Very interesting. It looks like to all intents and purposes the browsers actually use plugins to play video. The plugins have their own logic. This could be confirmed by adding filters to block CoreMedia and StageFright and see whether they can actually play the video.

  4. It is downloading a poster image, and the metadata to determine if the video can be played and how long it is. In your case, you have the metadata at the end of the file, so there is one extra web request.

    This is a feature. The iOS playback experience for HTML5 video is pretty good. On android, not so much.

  5. Does it still perform the third request if the video is hinted for streaming? I would expect it not to if it finds MOOV at the start of the file.

  6. John Campbell: This test page does not have the POSTER attribute. Are you sure it’s downloading a poster image? I don’t think it is. I don’t think this is a feature because it contradicts the Safari docs and differs from other mobile browsers. But also, in this case I have preload='none' which is a hint that the video is UNlikely to be played, so downloading more data to improve the playback experience is not the right thing to do.

  7. I think that if the poster attribute is not specified, then Safari on iOS will attempt to download part of the video stream in order to create a “dynamic” poster.

  8. Jim: I don’t think the video requests create an image. Notice that the video is a black rectangle in the test page.

  9. I examined in Premiere and noticed that the first 16 frames are indeed black. So that is consistent with the iOS player attempting to create a default poster image when none is explicitly supplied. However, I then took another video which has timecode embedded in it (so we can see if and when a poster is being produced) and mounted it in a page with the three preload settings. On iOS *none* of the options show a poster, similarly on Android 4.1. On Chrome desktop, only preload-auto produced an auto-poster.

  10. Steve: I thought I ran into this problem last October, but I’ve deleted the relevant capture files. I was trying to debug using Wowza to re-stream H.264 RTSP as “HTTP Live Streaming”, so I was more concerned at the time with getting the video to play on iOS than any weird side-effects. I’ll look into it further — it’s possible I’m misremembering what happened.

    Will: The preload test page is behaving strangely for me on iOS. I sometimes (but not always) get a background image (a white timestamp of “00;00;00;00”) for the “auto” and “metadata” cases; the play icon also seems to generally appear first on the “none” case, and then much, much later on the “auto” and “metadata” cases.

    I have also seen the timestamp appear on the “none” case, but I think this only happens (on iOS) if you play the video first, so that the background/poster image is presumably cached.

  11. Will Law: What is the conclusion? Is there no poster because my trailer.mp4 video starts with black frames? Or does iOS not actually generate a poster (and thus the video requests are for something else)?

  12. Well, I just tried it locally using both a static .mp4 and an .m3u8 for HTTP Live Streaming, and it does not appear to generate a dynamic “poster” from the video file.

    So either something has changed or I’m getting confused. :P Sorry for the noise.

  13. Yup. Moving the moov atom to the beginning most likely will make a big difference. In fact any mp4 file delivered via HTTP should have its moov atom at the front, otherwise playback will commence only when the whole file is downloaded (not really a good user experience).

  14. Thanks for sharing the details Steve. Fascinating stuff for sure, trying to understand how a specific solution works when sometimes the developers don’t even know themselves.

    @Eswar I’ve actually run into that problem recently, curiously though the problem behaved differently on different browsers so it took some extra time to track down.


  15. @Ahmad This behavior has been around a long time. I noticed it in fall of 2010 and blogged about it in Jan 2011:

    That said, I was focused on audio so the vast majority of the post is about AppleCoreMedia and odd behavior with audio. But at one point in the article, I talk about experimenting with a video file and finding that “Mobile Safari still downloads some part of the m4v file, but it isn’t the full file size (70500 bytes)” which is almost spot on what Steve noted as the Apache response #2 above.

    I never dug in further on the video side. But I think it is safe to assume this behavior has been around since 2011 and iOS 4.2.1.

  16. I should have known Grigs already blogged about this.

  17. @steve Well, I never dug into why it happens so my post was not of much use.

    FWIW, Mobile Safari—or more accurately, AppleCoreMedia—does something very similar with audio that makes pages unnecessarily bloated.

    Take for example. If you just look at what Akamai’s MobiTest reports for, you’ll see a small page of ~125K:

    You’ll get similar results if you use remote debugging by tethering your phone. It looks like a small page. It does have some interesting calls to MP3 that show no data transfer and no response codes. Screenshot here:

    BUT, if you watch the page load through a proxy (Charles to the rescue), you’ll see that there is downloading of portions of the MP3 files that don’t get reported by either the WebKit remote debugger or MobiTest. These downloads result in a total download of 3.04MB.

    A HAR file of my test using is available at:

    I don’t understand why iOS is doing this for either audio or video, but at least on the audio side, it is pretty unlikely that it is trying to make a poster. ;-)

  18. @Steve [threading back to comment #11] – my simple tests show that iOS does not generate a default poster, irrespective of the preload setting. The requests you see for iOS under preload=none are therefore most likely the browser trying to find the MOOV atom. In an mp4 file, the first 4 bytes hold the size of ATOM and next 4 bytes FTYP. So the first two bytes don’t tell you anything useful about the file other than it exists and can be read. So I think that first request is just an existence check. It then hunts for the MOOV atom, which may be at the beginning, which is where it first tries. As soon as it has downloaded enough data to figure out the actual location of the MOOV, it abandons that first load and begins a second one, this time homing in on the precise byte range. The fact that it does all of this under preload=none is in contradiction to Apple’s docs and something one would expect under preload=metadata.