Found an interesting Search Engine Optimization (SEO) article / Twitter conversation about soft-404s vs redirects through a Search Engine Roundtable Report and it got me thinking...
The resulting MADness is this SEO rant based on the best practices for publishing new site page drafts and placeholder pages.
First, here's a quick summary of the conversation and article.
What would you consider to be a good approach for dealing with temporarily empty category pages?
"My site adds no-index to avoid showing as soft 404s until products return to them. My worry is that switching between no-index/index is worse than a soft 404."
"I don't think there would be any noticeable difference between those two approaches. When things change (empty -> content), we usually get signals from various places (suddenly internal links), so we can use that to pick up crawling when needed."
What's he saying above?
Neither.
What he's hinting at is an empty page is better than no page at all.
Perhaps he's saying publish the page, and simply don't submit it to a search engine. Instead, it seems he's alluding to removing it from the sitemap... with a somewhat implied warning not to link to it from the live site pages or menus yet.
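To make the sitemap idea concrete, here's a minimal Python sketch that builds a sitemap and leaves out any page that isn't ready yet. The `ready` flag, the `build_sitemap` helper, and the example.com URLs are all made up for illustration; they aren't from the conversation or from any Google tool, and a real site would pull this from its CMS or product database.

```python
from xml.etree import ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(pages):
    """Return sitemap XML that lists only the pages flagged as ready.

    `pages` is a list of dicts with hypothetical keys "url" and "ready".
    """
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for page in pages:
        # Skip placeholder / empty pages so they are never submitted.
        if not page["ready"]:
            continue
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = page["url"]
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical site: one category with products, one still empty.
pages = [
    {"url": "https://example.com/shoes", "ready": True},
    {"url": "https://example.com/hats", "ready": False},
]
sitemap_xml = build_sitemap(pages)
print(sitemap_xml)
```

Once the empty page gets its content, flipping `ready` to `True` puts the URL back into the next generated sitemap. The same flag could also be used to keep the page out of menus and internal links until then.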
Are there differences in return crawl rate between no-index and soft 404 pages, and how quickly can web pages be made indexable again?
This wasn't answered. I've never really seen Google experts answer super specific questions like this... or, rather, I've never seen them give super specific answers to SEO questions.
Here's an opinion on it:
NEVER use soft 404s OR redirects for URLs that you eventually plan on updating and going live with.
Why...?
Do you think Google wants you to notify the Search Engine of a new web page before it has any content, a page the Search Engine shouldn't have known about yet?
Absolutely not. Google servers use bandwidth to index web pages. If you cause them to use that bandwidth to index a page that is simply a soft 404 / redirect, it's a waste of computing power.
The Google Algorithm, focused on scale and efficiency, likely takes that into account and would devalue the page.
Besides, Google ALREADY KNOWS your page is published!
How...?
The tracking code / tag on a web page collects the data to be parsed to your Analytics or other tag-connected accounts.
This code set is triggered when YOU or anyone else visits the page, assuming the user's IP address isn't excluded from tracking... but even in that case it still has to be triggered to log the activity that needs to be excluded from end-user level views and reporting. Because what Google knows about the site and what Google allows you to see about it are much, much different. Enough 'conspiracy' theories.
We simply have to expect / assume that the Google Algorithm recognizes the new page getting published without even a traditional index needed. Not just based on the above, but probably a whole lot more not being thought about.
With that assumption, any visitor will initiate the tracking code / tag, which collects the data to be parsed to your Analytics or other tag-connected accounts.
Let's circle back, though, because the initial thinking behind the question was great... What are the best practices for publishing page shells down by Larry Page's search store?
Once the content is eventually inserted or updated, optimized with on-page SEO best practices, and ready for public viewing from people (and robots), then:
It's also worth noting your Google Analytics Account should already be connected to your Google Search Console account. This way, once Google indexes the URL, it will have at least a light page domain history and associated user metrics from the Analytics data (or a related data set via tag-connected Google Apps).
If you read this article and it didn't make any sense, you should look for SEO services.