When square pixels aren't square

1 month ago (alexwlchan.net)

> Videos with non-square pixels are pretty rare...

Before HD, almost all video used non-square pixels. DVD is 720x480. SD channels on cable TV systems are 528x480.

  • >Before HD, almost all video used non-square pixels

    Correct. This came from the ITU-R BT.601 standard, one of the first digital video standards, whose authors chose to define digital video as a sampled analog signal. Analog video never had a concept of pixels; it operated on lines instead. The rate at which you sampled a line could be arbitrary and affected only the horizontal resolution. The rate chosen by BT.601 was 13.5 MHz, which resulted in a 10/11 pixel aspect ratio for 4:3 NTSC video and 59/54 for 4:3 PAL.
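
    To see where those two ratios come from, divide the sampling rate that would have produced square pixels for each system by the 13.5 MHz that BT.601 actually uses. A minimal Python sketch, taking the commonly quoted square-pixel rates (~12.27 MHz for NTSC, 14.75 MHz for PAL) as assumptions:

        from fractions import Fraction

        BT601_RATE = Fraction(27, 2)  # the 13.5 MHz luma sampling rate, in MHz

        # Sampling rates that would have yielded square pixels for a 4:3 picture
        # (commonly quoted values, treated as assumptions here).
        SQUARE_PIXEL_RATE = {
            "NTSC": Fraction(135, 11),  # ~12.27 MHz
            "PAL": Fraction(59, 4),     # 14.75 MHz
        }

        for system, square_rate in SQUARE_PIXEL_RATE.items():
            # Sampling slower than the square-pixel rate spreads each sample over
            # more of the line (a wider pixel); sampling faster narrows it.
            par = square_rate / BT601_RATE  # pixel aspect ratio, width:height
            print(system, par)              # NTSC -> 10/11, PAL -> 59/54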

    >SD channels on cable TV systems are 528x480

    I'm not actually sure about America, but here in Europe most digital cable and satellite SDTV is delivered as 720x576i 4:2:0 MPEG-2 Part 2. There are some outliers that use 544x576i, however.

    • Good post. For anyone wondering "why do we have these particular resolutions, sampling and frame rates, which seem quite random", allow me to expand and add some color to your post (pun intended). Similar to how modern railroad track widths can be traced back to the wheel widths of Roman chariots, modern digital video standards still reverberate with echoes from 1930s black-and-white television standards.

      BT.601 is from 1982 and was the first widely adopted digital component video standard (sampling analog video into 3 color components (YUV) at 13.5 MHz). Prior to BT.601, the main standard for digital video was SMPTE 244M, created by the Society of Motion Picture and Television Engineers - a composite video standard which sampled the analog signal at 14.32 MHz. Of course, a higher sampling rate is, all things being equal, generally better. The reason BT.601 ended up lower (13.5 MHz) was a compromise - equal parts technical and political.

      Analog television was created in the 1930s as a black-and-white composite standard, and in 1953 color was added by a very clever hack which kept all broadcasts backward compatible with existing B&W TVs. Politicians mandated this because they feared nerfing all the B&W TVs owned by voters. But that hack came with some significant technical compromises which complicated and degraded analog video for over 50 years. The composite sampling rate (14.32 MHz) is exactly 4x the NTSC color subcarrier frequency introduced by that hack, while the component rate (13.5 MHz) was chosen as a common multiple of the NTSC and PAL line frequencies. Those two frequencies directly dictated all the odd-seeming horizontal pixel resolutions we find in pre-HD digital video (352, 704, 360, 720 and 768) and even the original PC display resolutions (CGA, VGA, XGA, etc).

      To be clear, analog television signals were never pixels. Each horizontal scanline was only ever an oscillating electrical voltage from the moment photons struck an analog tube in a TV camera to the home viewer's cathode ray tube (CRT). Early digital video resolutions were simply based on how many samples an analog-to-digital converter would need to fully recreate the original electrical voltage.

      For example, 720 is tied to 13.5 MHz because sampling the roughly 53 µs active picture area of an analog video scanline at 13.5 MHz yields 720 samples. Similarly, 768 is tied to the 14.32 MHz composite rate. VGA's horizontal resolution of 640 simply comes from adjusting analog video's non-square sampling to square pixels (704 active samples x 10/11 = 640). It's kind of fascinating that all these modern digital resolutions can be traced back to decisions made in the 1930s based on which affordable analog components were available, which competing commercial interests prevailed (RCA vs Philco) and the political sensitivities present at the time.

      4 replies →

    • While analog video did not have the concept of pixels, it did specify the line frequency, the number of visible lines, the duration of the visible part of a line, and the image aspect ratio of 3:4 (height to width, i.e. the familiar 4:3). In Europe there are 576 visible lines, composed of 574 full lines and 2 half lines; some people count them as 575 lines, but the 2 half lines sit in 2 different lines of the image, not on the same line, so there are 576 distinct lines across the height of the image.

      From these 4 values one can compute the video sampling frequency that corresponds to square pixels. For the European TV standard, an image with square pixels would have been 576 x 768 pixels, obtained at a video sampling frequency close to 15 MHz.
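
      As a rough check of that "close to 15 MHz" figure, a minimal sketch using the commonly quoted ~52 µs of visible line time as an assumption:

          # 625-line ("European") video: ~52 us visible per line, 576 visible
          # lines, 4:3 picture -> 768 samples per line for square pixels.
          active_line_s = 52e-6
          visible_lines = 576
          square_pixel_width = visible_lines * 4 / 3       # 768.0
          sampling_rate = square_pixel_width / active_line_s
          print(sampling_rate / 1e6)                       # ~14.8 MHz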

      However, in order to allow more TV channels in the available bands, the maximum video frequency was reduced below what square pixels would require (which would have been close to 7.5 MHz in Europe), and then to an even lower maximum video frequency after the transition to PAL/SECAM, i.e. below 5.5 MHz, typically about 5 MHz. (Before the transition to color, Eastern Europe had used sharper black&white signals, with a maximum video frequency below 6.5 MHz, typically around 6 MHz. The 5.5/6.5 MHz limits are set by the location of the audio carrier. France had used an even higher-definition B&W system, but that had completely different parameters from the subsequent SECAM, being an 819-line system, while the East-European system differed only in the higher video bandwidth.)

      So sampling at a frequency high enough for square pixels would have been pointless, as the TV signal had already been reduced to a lower resolution by the earlier analog processing. Thus the 13.5 MHz sampling frequency chosen for digital TV, corresponding to pixels wider than they are tall, was still high enough to preserve the information contained in the sampled signal.

      2 replies →

    • My DVCAM equipment definitely outputs 720x576i, although whether that's supposed to render to 768x576 or to 1024x576 depends on whether it's 4:3 or 16:9 material.

      It still looks surprisingly good, considering.

    • Yeah. I recently stumbled across this in an interesting way and went down a rabbit hole. I was recreating an old game for my education[1]. ScummVM supports Eye of the Beholder and I used it to take screenshots to compare against my own work. I was doing the intro scenes and noticed that the title screens are 320x200. My monitor is 1920x1200, so the ratios are the same. It displays properly when I full-screen my game and all is good. However, on ScummVM it looked vertically elongated. I did some digging and read up on old monitors and how they displayed these modes. ScummVM has a setting called "aspect ratio correction" which stretches the pixels vertically and produces pillarboxing to give you the "original nostalgic feel".

      Notes: 1. https://eye-of-the-gopher.github.io/

  • I'm confused... what does DVD, SD or any arbitrary frame size have to do with the shape of pixels themselves? Is that not only relevant to the display itself and not the file format/container/codec?

    My understanding is that televisions would mostly have square/rectangular pixels, while computer monitors often had circular pixels.

    Or are you perhaps referring to pixel aspect ratios instead?

    • I'm not 100% sure I understand your question, but in order to display a DVD correctly, you need to either display the pixels stored in the video stream wider than they are tall (for widescreen), or narrower than they are tall (for 4:3). Displaying those pixels 1:1 on a display with square pixels would never be correct for DVD video.

    • CRTs didn't have pixels at all. They had shadow masks (or aperture grilles) and phosphors, which could be a triad of rectangles, lines spanning basically the entire screen height, or dots. They did not line up with the signal, so it doesn't make sense to call them pixels.

    • It comes about from digitizing analog video signals. The early standards for sampling (digitizing) analog video signals resulted in the digital pixel horizontal sample size (often) being wider than the line spacing of the displayed analog video. With the result that digitized video of analog signals usually has a "pixel size" (analog video has no concept of discrete horizontal pixels) that is wider than it is tall.

    • A square pixel has a 1:1 aspect ratio (the width is the same as the height). Any other rectangular pixel, whose width differs from its height, would be considered "non-square".

      F.ex. in case of a "4:3 720x480" frame… a quick test: 720/4=180 and 480/3=160… 180 vs. 160… different results… which means the pixels for this frame are not square, just rectangular. Alternatively 720/480 vs. 4/3 works too, of course.
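
      The same test as a minimal Python sketch (naive arithmetic on the full stored frame, ignoring the active-area subtleties discussed elsewhere in the thread):

          from fractions import Fraction

          def pixel_aspect_ratio(width, height, display_aspect):
              # Width:height of one stored pixel, given the stored frame size
              # and the display aspect ratio (DAR) it is meant to be shown at.
              return display_aspect / Fraction(width, height)

          print(pixel_aspect_ratio(720, 480, Fraction(4, 3)))  # 8/9 -> not square
          print(pixel_aspect_ratio(640, 480, Fraction(4, 3)))  # 1   -> square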

      3 replies →

    • CRTs do not have pixels. At all. The shapes you might see on the screen if you look really closely at it are solely different bands of color phosphors. CRTs are capable of drawing arbitrary beam shapes on them in one color[0]; but you need regularly spaced filters and phosphor patterns in order to get multiple colors. If these """pixels""" were bigger, you'd see a perfectly normal red part of the image, next to a perfectly normal green part of the image, next to a perfectly normal blue part of the image.

      What a CRT actually draws, though, are lines. Analog television is a machine that chops up a 2D plane into a stack of lines, which are captured, broadcast, and drawn to the screen with varying intensity. Digital television - and, for that matter, any sort of computer display - absolutely does need that line to be divided into timesteps, which become our pixels. But when that gets displayed back on a CRT, the "pixels" stop mattering.

      In the domain of analog television, the only properties of the video that are actually structural to the signal are the vertical and horizontal blanking frequencies - how many frames and lines are sent per second. The display's shape is implicit[1]: you just have to send 480 lines, and then those lines get stretched to fit the width[2] of the screen. A digital signal being converted to analog can be anything horizontally. A 400x480 and a 720x480 picture will both be 4:3 when you display them on a 4:3 CRT.

      Pixel aspect ratio (PAR) is how the digital world accounts for the gap between pixels and lines. The more pixels you send per line, the thinner the pixels get. If you send exactly as many horizontal pixels as the line count times the display's aspect ratio, you get square pixels. For a 4:3 monitor[3], that's 640 pixels, or 640x480. Note that that's neither the DVD nor the SD cable standard - so both had non-square pixels.

      Note that there is a limit to how many dots you can send. But this is a maximum - a limitation of the quality of the analog electronics and the amount of bandwidth available to the system. DVD and SD cable are different sizes from each other, but they both will display just fine even on an incredibly low-TVL[4] blurry mess of a 60s CRT.

      [0] There were some specialty tubes that could do "penetrative color", i.e. driving the electron gun beyond a certain voltage would switch the spot to a different color. This did not catch on.

      [1] As well as how many lines get discarded during vertical blanking, how big the overscan is, etc.

      [2] Nothing physical would stop you from making a CRT that scans the other way, but AFAIK no such thing exists. Even arcade cabinets with portrait (tate) monitors were still scanning by the long side of the display.

      [3] There's a standard for analog video transmission from 16:9 security cameras that have 1:1 pixel aspect ratio - i.e. more pixels per line. It's called 960H, because it sends... 960 horizontal pixels per line.

      https://videos.cctvcamerapros.com/surveillance-systems/what-...

      [4] Television lines - i.e. how many horizontal lines can the CRT display correctly? Yes, this terminology is VERY CONFUSING and I don't like it. Also, it's measured differently from horizontal pixels.

      3 replies →

  • Displaying content from a DVD on a panel with square pixels (LCD, plasma, etc.) required stretching or omitting some pixels. For widescreen content you'd need to stretch that 720x480 to 848x480, and for 4:3 content you'd need to stretch it to 720x540, or shrink it to 640x480, depending on the resolution of the panel.
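
    A minimal sketch of that arithmetic; the exact 16:9 width comes out at 853.33, and players typically round to a codec-friendly value, which is presumably where 848 comes from:

        from fractions import Fraction

        def square_pixel_sizes(stored_w, stored_h, dar):
            # Two ways to reach square pixels: rescale the width to match the
            # display aspect ratio (DAR), or rescale the height instead.
            return (
                (round(stored_h * dar), stored_h),   # change the width
                (stored_w, round(stored_w / dar)),   # change the height
            )

        print(square_pixel_sizes(720, 480, Fraction(4, 3)))   # ((640, 480), (720, 540))
        print(square_pixel_sizes(720, 480, Fraction(16, 9)))  # ((853, 480), (720, 405))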

    CRTs of course had no fixed horizontal resolution.

    Edit: I just realized I forgot about PAL DVDs which were 720x576. But the same principle applies.

    • When I played an anamorphic PAL DVD on a 4:3 CRT, the picture would look vertically stretched until I pressed the 'aspect ratio' button on the TV.

      This would correct the display, but how did it do it? Was it by drawing the same number of scanlines, but reducing the vertical distance between each line?

      2 replies →

  • Even with modern digital codecs and streaming, there's usually chroma subsampling[1], so the color channels may have non-square "pixels" even if overall pixels are nominally square. I most often see 4:2:0 subsampling, which still has square pixels, but at half resolution in each dimension. However 4:2:2 is also fairly common, and it has half resolution in only one dimension, so the pixels are 2:1. You'd have trouble getting a video decoding library to mess this up though.

    [1]: https://en.wikipedia.org/wiki/Chroma_subsampling
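
    A minimal sketch of the chroma plane sizes those schemes imply for a planar YCbCr frame:

        def chroma_plane_size(width, height, scheme):
            # Size of each chroma (Cb/Cr) plane for common subsampling schemes.
            factors = {
                "4:4:4": (1, 1),  # full resolution
                "4:2:2": (2, 1),  # half horizontal resolution -> 2:1 chroma "pixels"
                "4:2:0": (2, 2),  # half in both directions -> square, quarter area
            }
            fw, fh = factors[scheme]
            return width // fw, height // fh

        print(chroma_plane_size(1920, 1080, "4:2:0"))  # (960, 540)
        print(chroma_plane_size(1920, 1080, "4:2:2"))  # (960, 1080)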

  • As a budding young computer geek and mildly OCD person I always hated the non-square pixels. As far as I recall, square pixels just didn't exist at the time. Maybe at Xerox PARC or somewhere.

    They never quite looked right, and making pixel graphics was a bit of a hassle since your perfect design on graph paper didn't look the same on screen, etc, etc, etc. I mean it wasn't life-threatening, just a tiny source of never-ending annoyances.

    My Macintosh 512e (one of the early "toaster Macs") had square pixels and it was so great to finally have them.

  • DVD also supports 352x480. These pixels are very non-square.

    Why would you want this? VHS. NTSC has 480-ish visible scanlines, but VHS only has bandwidth for about 350 pixels per line.

  • Just look at Japanese television… most channels are broadcast at 1440x1080i for 16:9 content instead of the full 1920x1080i (to save bandwidth for other things, I assume), so it's still very common with HD too.

    • It may also be due to legacy reasons. Japan was a pioneer in adopting HD TV years before the rest of the world, but early HD cameras and video formats like HDCAM and HDV only recorded 1080i at 1440x1080. If their whole video processing chain is set up for 1440x1080, they’d likely have to replace a lot of equipment to switch over to full 1920x1080i.

  • One of my colleagues was very keen to ensure that 576i material, when upscaled or downscaled, went via a 702-pixel-wide image with crop/pad to 720 before scaling.

We have been programming computers for 70+ years.

We still have to dig up the information needed for small tasks as if it were a note buried in a stack of papers in the spare room that has absorbed a decade's worth of clutter.

Another canary I notice is the presence of "Please don't hit the back button" on web pages served by major corporations. Something bad might happen if you click/touch/return that button! Hands off your input devices, please!

On the progress front, we know how to topologically layer information, like never before. Huge appearing/disappearing header bars, over popup ads, over content. Such screen space efficiency.

In some ways we have come far. In a truly remarkable number of ways, not so much.

This reminded me of Retina screenshots on the Mac — selecting a 100×100 area can produce a 200×200 file. Different cause, but the same idea: the stored pixels don't always match what you see on screen.

  • This is indeed similar in effect, but completely different in cause, to the phenomenon referenced in the article (device pixel ratio vs pixel aspect ratio).

    What you're referring to stems from an assumption made a long time ago by Microsoft, later adopted as a de facto standard by most computer software. The assumption was that the pixel density of every display, unless otherwise specified, was 96 pixels per inch [1].

    The value stuck and started being taken for granted, while the pixel density of displays started growing much beyond that—a move mostly popularized by Apple's Retina. A solution was needed to allow new software to take advantage of the increased detail provided by high-density displays while still accommodating legacy software written exclusively for 96 PPI. This resulted in the decoupling of "logical" pixels from "physical" pixels, with the logical resolution being most commonly defined as "what the resolution of the display would be given its physical size and a PPI of 96" [2], and the physical resolution representing the real amount of pixels. The 100x100 and 200x200 values in your example are respectively the logical and physical resolutions of your screenshot.

    Different software vendors refer to these "logical" pixels differently, but the names you're most likely to encounter are points (Apple), density-independent pixels ("DPs", Google), and device-independent pixels ("DIPs", Microsoft). The value of 96, while the most common, is also not a standard per se: Android uses 160 PPI as its base, and Apple for a long time used 72.

    [1]: https://learn.microsoft.com/en-us/archive/blogs/fontblog/whe...

    [2]: https://developer.mozilla.org/en-US/docs/Web/API/Window/devi...
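
    A minimal sketch of that logical/physical split (the 192 PPI panel is a made-up example chosen so the scale factor is exactly 2; real displays often have non-integer factors):

        def logical_size(physical_w, physical_h, physical_ppi, base_ppi=96):
            # The "pretend the display is base_ppi" convention described above.
            # The scale factor is roughly what the web exposes as devicePixelRatio.
            scale = physical_ppi / base_ppi
            return physical_w / scale, physical_h / scale

        # Hypothetical 2560x1600 panel at 192 PPI -> scale factor of 2:
        print(logical_size(2560, 1600, 192))  # (1280.0, 800.0)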

    • I might be misunderstanding what you're saying, but I'm pretty sure print and web were already more popular than anything Apple did. The need to be aware of output size and scale pixels was not at all uncommon by the time retina displays came out.

      From what I recall only Microsoft had problems with this, and specifically on Windows. You might be right about software that was exclusive to desktop Windows. I don't remember having scaling issues even on other Microsoft products such as Windows Mobile.

      1 reply →

    • Why does the PPI matter at all? I thought we only cared about the scaling factor - so 2 in this 100-to-200 scenario. It's not like I'm trying to display a true-to-life gummy bear on my monitor; we just want sharp images.

      1 reply →

    • I think resolution always refers to the physical resolution of the display. But rendering can use scaling to make things appear to the user at whatever real size, regardless of the underlying resolution.

      1 reply →

I’m no expert but this sounds like a digital version of the anamorphic lens/system, doesn’t it?

  • It is.

    Some modern films are still shot with anamorphic lenses because the director/DP likes the look, so we in the VFX industry have to deal with plate footage that way, and thus have to handle non-square pixels in the software processing the images (to de-squash the image, even though the digital camera sensor pixels that recorded the image from the lens were square) in order to display it correctly (i.e. so that round, circular things still look round and are not squashed).

    Even to the degree that full CG element renders (i.e. rendered to EXR with a pathtracing renderer) should really use anisotropic pixel filter widths to look correct.

  • Yes, and when working with footage shot with anamorphic lenses one will have to render the footage as non-square pixels, mapped to the square pixels of our screens, to view it at its intended aspect ratio. This process is done either at the beginning (conforming the footage before sending to editorial / VFX) or end (conforming to square pixels as a final step) of the post-production workflow depending on the show.
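
    As an illustration only (real pipelines do this in the conform/compositing tools, not like this), a minimal Pillow sketch of the de-squeeze step, assuming the pixel aspect ratio is already known:

        from PIL import Image  # Pillow

        def desqueeze(frame: Image.Image, pixel_aspect: float) -> Image.Image:
            # Resample a frame stored with non-square pixels to square pixels
            # for display, by stretching the width by the pixel aspect ratio.
            w, h = frame.size
            return frame.resize((round(w * pixel_aspect), h), Image.LANCZOS)

        # e.g. for footage squeezed 2x by the lens: desqueeze(frame, 2.0)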

Proving that everything is more complicated than you first think it is when you lift up a corner of the rug.

> It’s especially common in vertical videos like YouTube Shorts, where the stored resolution is a square 1080 × 1080, and the aspect ratio makes it a portrait.

My guess is this is because encoding hardware can do at most 1920x1080, and there is no easy way to make that hardware encode 1080x1920, so you are forced to encode as 1080x1080. Swapping rows and columns in hardware tends to be a big change because caching and readahead totally change when you process the data in a different order.

  • They've supported 4K for so long that I'd be surprised if they don't have enough boards capable of 1920 height.

    And even then, why make it 1080 wide?

    I feel like there's more going on. And maybe it's related to shorts supporting any aspect ratio up to 1:1.

    But that's all assuming the article is giving an accurate picture of things in the first place. I went and pulled up the max resolutions for three random shorts: 576x1024, 1080x1920, 1080x1920. The latter two also have 608x1080 options.

My first thought was that pixels are never square. Squares are an artifact of nearest-neighbor sampling onto another grid. I suppose pixel art assumes knowledge of this final grid, but most media doesn't?

Furthermore, the referencing of a raster can assume any shape or form. It makes some sense some signals are optimized for hardware restrictions.

Another interesting example are anamorphic lenses used in cinema.

I'm reminded of how 720p plasma TVs had panel resolutions of 1024x768 - the pixels themselves were rectangular.

Am I missing the obvious? It seems like the author is just messing with the aspect ratio.

  • No, the author is highlighting the fact that the aspect ratio a video is stored in doesn't always match the aspect ratio a video is displayed in. So simply calculating the aspect ratio based on the number of horizontal and vertical pixels gives you the storage ratio, but doesn't always result in the correct display ratio.

  • Yes, I think they are conflating square pixels with square pixel aspect ratios.

    If a video file only stores a single color value for each pixel, why does it care what shape the pixel has when it's displayed? It would be filled in with that single color value regardless.

    • Because if that pixel takes up 2 vertical pixels when displayed in your web browser, the video takes up more space on the page and causes layout shift.

      I thought I understood the article just fine, but these comments are confusing.

SAR vs DAR is what I had to learn when working with ffprobe, among other things.
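
For anyone in the same spot, a minimal sketch of pulling those values out of ffprobe's JSON output (assumes ffprobe is on PATH; the relationship is DAR = SAR x stored width / stored height):

    import json
    import subprocess
    from fractions import Fraction

    def aspect_info(path):
        # Stored size, sample aspect ratio (SAR, i.e. pixel aspect ratio) and
        # display aspect ratio (DAR) of the first video stream.
        out = subprocess.run(
            ["ffprobe", "-v", "error", "-select_streams", "v:0",
             "-show_entries",
             "stream=width,height,sample_aspect_ratio,display_aspect_ratio",
             "-of", "json", path],
            capture_output=True, text=True, check=True,
        ).stdout
        s = json.loads(out)["streams"][0]
        num, den = map(int, s.get("sample_aspect_ratio", "1:1").split(":"))
        sar = Fraction(num, den) if num and den else Fraction(1)  # "0:1" = unknown
        dar = sar * Fraction(s["width"], s["height"])
        return s["width"], s["height"], sar, dar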