BlogRank changes

We have made a few changes to our BlogRank algorithm in order to make it easier to distinguish between the blogs at the top of the BlogRank scale.

The previous version of the algorithm tended to accumulate more and more blogs at the top of the BlogRank scale as our blog index grew.

This has now been adjusted so that only the few most popular blogs have BlogRank 10. For blogs with a low BlogRank the change won’t be noticeable, but a significant share of the high-ranking blogs will get a lower BlogRank, as the graphs below show:

Distribution of Swedish blogs with BlogRank 2-6 before and after the change.

Distribution of Swedish blogs with BlogRank 6-10 before and after the change.

For the exact numbers on how much the BlogRank distribution has changed for all Swedish blogs, see the table below:

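| BlogRank | Number of blogs before | Number of blogs after |
|----------|------------------------|-----------------------|
| 1        | 101,380                | 104,262               |
| 2        | 4,295                  | 3,995                 |
| 3        | 2,081                  | 1,435                 |
| 4        | 1,170                  | 609                   |
| 5        | 694                    | 279                   |
| 6        | 429                    | 129                   |
| 7        | 299                    | 54                    |
| 8        | 177                    | 19                    |
| 9        | 141                    | 13                    |
| 10       | 137                    | 8                     |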
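
The numbers in the table can be checked with a few lines of arithmetic. The sketch below copies the counts from the table; the function names are only illustrative and are not part of any of our APIs.

```python
# BlogRank -> (number of blogs before, number of blogs after),
# copied from the table above.
distribution = {
    1: (101_380, 104_262),
    2: (4_295, 3_995),
    3: (2_081, 1_435),
    4: (1_170, 609),
    5: (694, 279),
    6: (429, 129),
    7: (299, 54),
    8: (177, 19),
    9: (141, 13),
    10: (137, 8),
}


def totals(dist):
    """Return the total number of blogs (before, after)."""
    before = sum(b for b, _ in dist.values())
    after = sum(a for _, a in dist.values())
    return before, after


def count_at_or_above(dist, min_rank):
    """Return how many blogs have BlogRank >= min_rank (before, after)."""
    before = sum(b for rank, (b, _) in dist.items() if rank >= min_rank)
    after = sum(a for rank, (_, a) in dist.items() if rank >= min_rank)
    return before, after


if __name__ == "__main__":
    print("total blogs (before, after):", totals(distribution))
    print("BlogRank >= 6 (before, after):", count_at_or_above(distribution, 6))
```

Both columns sum to 110,803 blogs, so the change only moved blogs between ranks. The number of blogs with BlogRank 6 or higher dropped from 1,183 to 223.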

For details about BlogRank, authority and top authority see the ranking documentation.

API improvements

In both of our APIs, LiveFeed and Search, we truncate the summary field of certain posts for stability and performance reasons. We recently discovered two problems with the truncation. First, it would leave undesired HTML entities in the document under certain circumstances. Second, it could cut off the last word when truncation occurred.

On 2016-10-11 we rolled out a fix which remedies both of these flaws. You should not find HTML entities in the summary field anymore (please tell us if you still find any!) and the last word of a truncated summary should be left intact. Note that the vast majority of posts are not, and will never be, truncated.
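
For readers who truncate text on their own side, here is a minimal sketch of one way to avoid both problems. It is not the code running in our APIs: `truncate_summary` and `max_chars` are hypothetical names, and decoding entities before cutting is just one plausible approach.

```python
import html


def truncate_summary(text, max_chars):
    """Decode HTML entities, then truncate without splitting the last word.

    Hypothetical helper for illustration only.
    """
    # Decode first, so a cut can never land in the middle of an entity
    # such as "&amp;".
    decoded = html.unescape(text)
    if len(decoded) <= max_chars:
        return decoded  # short enough, nothing to truncate

    # The cut falls exactly on a word boundary: keep everything before it.
    if decoded[max_chars].isspace():
        return decoded[:max_chars].rstrip()

    head = decoded[:max_chars]
    cut = head.rfind(" ")  # only plain spaces are treated as boundaries here
    if cut == -1:
        return head  # one very long word: a hard cut is the only option
    return head[:cut].rstrip()


if __name__ == "__main__":
    print(truncate_summary("Tom &amp; Jerry are friends", 12))
```

With a limit of 12 characters the example prints `Tom & Jerry`: the entity is decoded and the partial word "are" is dropped rather than cut in half.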

Another recent improvement is also related to undesired HTML. The tags field, which is only present in the Search API, should not contain HTML as of 2016-09-08.

TLS 1.1 and TLS 1.2 fix

Due to a misconfiguration in our retrieval system, we have not been able to ingest feeds from sites that accept only the TLS 1.1 or TLS 1.2 encryption protocols. Since most encrypted sites still serve TLS 1.0, or even SSL 3.0, the misconfiguration has most likely not been noticeable in our ingestion statistics, but we expect to see more sites deprecating older protocols in favor of TLS 1.2.

The configuration has been updated and we are all feeling a bit more secure.
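
If you want to check which protocol version a feed host negotiates, a small sketch using Python's standard `ssl` module could look like this. It is unrelated to our retrieval system's actual configuration; the function names are hypothetical, and the host in the usage note is a placeholder.

```python
import socket
import ssl


def make_modern_tls_context():
    """Build a client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx


def negotiated_tls_version(host, port=443, timeout=10.0):
    """Connect to host and return the negotiated protocol name.

    Raises ssl.SSLError if the server cannot offer TLS 1.2 or newer.
    """
    ctx = make_modern_tls_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Calling `negotiated_tls_version("example.com")` performs a real handshake and returns a string such as `TLSv1.2` or `TLSv1.3`, depending on what the server and the local OpenSSL build support.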

.NET client for Search API

We have published a .NET client for the Search API. Grab it from NuGet.

Java client for Search API

We have released a new, rewritten version of our Java client for the Search API. Get it now.