Little Bits of Cheese – Page 4 – Kraft's Random Short Thoughts

March 25, 2021

Just because I like old protocols: has anyone heard of an NNTP-WordPress bridge?

I kinda want to put together one for a p2-ish multisite.

Microblog
March 14, 2021

Picked up a dirty cold brew at @BookPeople while grabbing an online order. The jolt is exactly what my Sunday needed.

Microblog

coffee
March 9, 2021

Don’t speed.

Microblog

speed demon
March 7, 2021

One thing I like about Tumblr is the post queue concept. Like scheduled posts/tweets, but with less work.

Write a lot of notes and just have Tumblr drip them per your rules.

Microblog

meta, tumblr
March 7, 2021

TIL: Jamba Juice sells pretzels. 🥨

It’s going to take a while for my mind to process that.

Microblog

jamba juice, pretzels
March 5, 2021

How Post Content is Stored on Tumblr
macmanx:
engineering:

We’re currently rolling out an opt-in beta for a new post editor on web which will leverage the Neue Post Format behind the scenes. It’s been a very long time coming – work on the Neue Post Format began in 2015 and was originally codenamed “Poster Child”, and it was born out of a lot of things we learned dealing with the previous new post editor we released on web around that time. Over the years, the landscape of how people make posts on different platforms across the internet has changed dramatically. But here on Tumblr, we still want to stay true to our blogging roots, while giving access to a wide creative canvas, and the Neue Post Format reflects that work.

With literally billions (tens of billions!) of posts on Tumblr, how do we move this churning engine of content from one format to another without breaking everything? It took many phases, and releasing the new editor on the web will be one of the final pieces in place. To understand how far we’ve come and the challenges we’ve had to face, you need to know the deep dark secrets of how we store post content on Tumblr. This hellsite we all love is held together by duct tape, good intentions, and luck, and we’re constantly working to make it better!

A post is seemingly a very simple data model: it has an author, it has content, and it was posted at a certain time. Every post has a unique identifier once it’s created. Reblogs also record the “parent” post and blog they were reblogged from (more on How Reblogs Work over here). In a standard normalized database table, these columns would look like:

Post identifier (a very big integer)

Author blog identifier (an integer pointing to the “blogs” database table)

Parent post identifier (if it’s a reblog)

Parent blog identifier (if it’s a reblog)

When it was posted (a timestamp of some kind)

Post content (more on this in a minute)
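As a rough illustration of the columns listed above, here is a minimal Python sketch of one row in such a table. The class name `PostRow` and every field name are assumptions made for this example; the post does not give Tumblr’s actual column names or types.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class PostRow:
    """Hypothetical row in a normalized "posts" table (all names are illustrative)."""

    post_id: int                           # post identifier (a very big integer)
    author_blog_id: int                    # points at a row in the "blogs" table
    posted_at: datetime                    # when it was posted
    content: str                           # post content, format not specified here
    parent_post_id: Optional[int] = None   # set only if this post is a reblog
    parent_blog_id: Optional[int] = None   # set only if this post is a reblog

    @property
    def is_reblog(self) -> bool:
        # A post is a reblog exactly when it records a parent post.
        return self.parent_post_id is not None


when = datetime(2021, 3, 5, tzinfo=timezone.utc)
original = PostRow(post_id=1, author_blog_id=10, posted_at=when, content="Hello")
reblog = PostRow(post_id=2, author_blog_id=20, posted_at=when, content="Hello",
                 parent_post_id=1, parent_blog_id=10)
```

The two parent fields default to `None` so that an original post and a reblog share one shape, which mirrors the “if it’s a reblog” wording in the list.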

Before the Neue Post Format, posts had discrete “types”, so that’d be a column here as well. But once you have these discrete “types”, you have to determine how you want to store the content of each “type”. For photo posts, this is a set of one or more images. For video posts, this is either a reference to an uploaded video file, or it’s a URL to an external video. For text posts, it’s just text, in HTML format. So the actual value of that “post content” column can change depending on what type it is.

Here’s a simple example; note how each post type has different kinds of content:

As Tumblr grew, its capabilities grew. We added the ability to add a caption to photo, video, and audio posts. We added the ability to add a “source” to quote posts. We needed somewhere to store that new post content. Because Tumblr was growing so rapidly at the time, this needed to happen fast, so we took the easiest path available: add a new column! That first “post content” column was renamed “one”, and the new post content column was named “two”. And as Tumblr grew more, eventually we added “three”. And each column’s value could be different based on the post type.

Needless to say, eventually this made it very difficult to have consistent and easy-to-understand patterns for how we figure out things like… how many images are in a post? Since we added the ability to add an image in the caption, it’s possible there are images in the “one”, “two”, or “three” columns, but each may be in a different format based on the post type. Reblogs further complicate the storage design, as a reblog copies and reformats post content from its parent post to the new post. The code to figure out how to render a post became extremely complicated and hard to change as we wanted to add more to it.
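To make the “how many images are in a post?” problem concrete, here is a small Python sketch of what type-dependent counting might look like. Only the column names “one” and “two” and the post types come from the post; the per-type storage formats (a newline-separated URL list for photo posts, HTML elsewhere) are invented for illustration and are not Tumblr’s real formats.

```python
import re

# Matches the start of an <img> tag inside an HTML fragment.
IMG_TAG = re.compile(r"<img\b", re.IGNORECASE)


def count_images_legacy(post: dict) -> int:
    """Count images in a hypothetical legacy row; each type needs its own rule."""
    post_type = post["type"]
    if post_type == "photo":
        # Assumed format: "one" is a newline-separated list of image URLs,
        # "two" is an HTML caption that may itself embed images.
        urls = [u for u in post["one"].splitlines() if u.strip()]
        return len(urls) + len(IMG_TAG.findall(post.get("two") or ""))
    if post_type == "text":
        # Assumed format: "one" is the HTML body.
        return len(IMG_TAG.findall(post["one"]))
    if post_type == "video":
        # Assumed format: "one" is a video reference, "two" is an HTML caption.
        return len(IMG_TAG.findall(post.get("two") or ""))
    raise ValueError(f"unhandled post type: {post_type}")


photo_post = {
    "type": "photo",
    "one": "a.jpg\nb.jpg",
    "two": '<p>caption <img src="c.jpg"></p>',
}
text_post = {"type": "text", "one": "<p>no pictures here</p>"}
```

Every new post type or column means another branch, which is the kind of growth the post describes.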

Further complicating this was the fact that most (but not all) of these post content fields leveraged either HTML or PHP’s built-in serialization logic as the literal data format. Before PHP 7, HTML parsing in PHP (which is what Tumblr uses behind the scenes) was extremely slow, so rendering a post became more of a struggle as the post’s reblog trail grew or its post content complexity increased. And neither HTML nor PHP’s serialization format is easily portable to other languages, like Go, Scala, Objective-C, Swift, or Java, which we use in other backend services and our mobile apps.

With all this in mind, in 2015, two needs converged: the need to have a more easily understandable and portable data format shared from the database all the way up to the apps, and the need for more types of post content, decoupled from post type. The Neue Post Format was born: a JSON-based data schema for content blocks and their layout. This has afforded us the flexibility to make new types of content available faster, without needing to worry necessarily about how we’ll store it in HTML format, and has made the post content format portable from the database up to the Android app, iOS app, and the new React-based web client.
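For a feel of what “a JSON-based data schema for content blocks and their layout” could look like, here is a hedged Python sketch. The block fields, the `rows` layout shape, and the helper `count_blocks` are assumptions for illustration; they are not the actual Neue Post Format specification.

```python
import json

# A hypothetical post body: a flat list of typed content blocks,
# plus a layout that refers to those blocks by index.
post_content = {
    "content": [
        {"type": "text", "text": "Look at this!"},
        {"type": "image", "url": "https://example.com/cat.jpg"},
        {"type": "video", "url": "https://example.com/cat.mp4"},
    ],
    # Text on its own row, then the image and the video side by side.
    "layout": [{"type": "rows", "rows": [[0], [1, 2]]}],
}


def count_blocks(content: dict, block_type: str) -> int:
    """Count blocks of one type; no per-post-type branching is needed."""
    return sum(1 for block in content["content"] if block["type"] == block_type)


# JSON round-trips through any language with a JSON parser,
# unlike a language-specific serialization format.
serialized = json.dumps(post_content)
restored = json.loads(serialized)
```

With this shape, counting images is one pass over a single list, and the same post can hold an image block and a video block without needing a post type at all.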

Going back to the standard, normalized database table schema for posts, we’ve now achieved the intended simplicity with a flexible JSON structure inside that “post content” column. We no longer need post types at all when storing a post. A post can have any and all of the content types within it, instead of being siloed separately with a myriad of confusing options depending on the post type. Now a post can be a video and photo post at the same time! When the new editor on the web is fully released, we can finally say that this format is the fuel powering the engine of content on Tumblr. It’ll enable us to more quickly build out block types and layouts we couldn’t before, such as polls, blog card blocks, and overlapping images/videos/text. Sky’s the limit.

– @cyle

tl;dr

“We no longer need post types at all when storing a post. A post can have any and all of the content types within it, instead of being siloed separately with a myriad of confusing options depending on the post type. Now a post can be a video and photo post at the same time!”

Coming soon! (or switch on the beta for yourself the next time you’re prompted)

Microblog
March 4, 2021

Do you lie to your kids?

I tried to act like cauliflower rice was good when the place tonight gave it to all the kids instead of regular rice.

Microblog

daddy lies, parenthood
March 2, 2021

Happy Texas Independence Day!

Microblog

1836, texas, texas independence day
March 1, 2021

I dunno, man. I’d think it pretty important to have someone on site know how to flip the switch to backup power.

Nobody knew how to restore power at Ullrich Water Treatment Plant during the freeze. It was out for three hours

Microblog
February 28, 2021

My girls will mess you up.

Microblog

archery, 🏹
