I have just run a larger crawl. Now some of my seed sites have a ringset value of 2 or even 3. Also, I set the crawl depth to 4, yet have no entries with a ringset value of 4.

  • anon


    Thanks for your question and for being a VOSON subscriber.

    Unfortunately, the Ringset variable/attribute is currently not being set correctly. What it should do is this: all seed sites should be categorised ringset=1, and sites that connect to seed sites (i.e. they link to a seed site or are linked to by a seed site) should have ringset=2.

    But this is not working properly and, as you are finding, there are sometimes values > 2.

    If this is a problem for your research, please email support@uberlink.com and we can make a fix to your particular database.
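
    As a rough illustration of the intended rule described above (not the actual VOSON code), here is a minimal Python sketch. The function name `assign_ringset`, the example site names, and the choice to leave sites that do not touch a seed without a value are all assumptions made for the example.

```python
def assign_ringset(seeds, links):
    """Sketch of the intended rule: seeds get 1, direct neighbours of seeds get 2.

    seeds: iterable of seed site names (hypothetical input format)
    links: iterable of (source_site, target_site) pairs (hypothetical input format)
    """
    seeds = set(seeds)
    # Every seed site is ringset=1.
    ringset = {site: 1 for site in seeds}
    for source, target in links:
        if source in seeds and target not in seeds:
            # Site is linked to by a seed site.
            ringset[target] = 2
        elif target in seeds and source not in seeds:
            # Site links to a seed site.
            ringset[source] = 2
    # Assumption: sites with no direct connection to a seed are left unassigned.
    return ringset


seeds = {"a.org", "b.org"}
links = [
    ("a.org", "c.org"),  # seed links out to c.org
    ("d.org", "b.org"),  # d.org links in to a seed
    ("a.org", "b.org"),  # seed to seed, both stay at 1
    ("c.org", "e.org"),  # e.org never touches a seed
]
result = assign_ringset(seeds, links)
```

    Under this rule a seed site never ends up with a value other than 1, and no site receives a value above 2.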

    Finally, you mentioned that you are surprised that you don't have ringset=4 even though you set crawl depth to 4. Apart from the problem mentioned above with ringset not being set correctly, even if it were, it has nothing to do with the crawl depth parameter. Crawl depth influences the crawler's behaviour within a given website. Depth (pages) indicates how many pages the crawler will crawl before it gives up (if another constraint hasn't already been hit). Depth (levels) indicates how many levels the crawler will go to within the website, with the entry page being level 1, pages that the entry page links to being level 2, and so on.

    So crawl depth has nothing to do with ringset. Ringset indicates whether a site is a seed site (ringset=1) or a site connected to a seed site (ringset=2).
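
    To make the two depth settings concrete, here is a small Python sketch of a crawl inside a single website. It assumes a breadth-first traversal and uses hypothetical names (`crawl_site`, `max_levels`, `max_pages`) and a made-up site map; the real crawler may work differently.

```python
from collections import deque


def crawl_site(entry_page, page_links, max_levels, max_pages):
    """Breadth-first crawl of one site, limited by levels and by page count.

    page_links: dict mapping a page to the pages it links to (hypothetical format)
    Returns a dict of crawled page -> level, with the entry page at level 1.
    """
    level = {entry_page: 1}
    queue = deque([entry_page])
    crawled = []
    while queue and len(crawled) < max_pages:  # depth (pages) limit
        page = queue.popleft()
        crawled.append(page)
        if level[page] >= max_levels:  # depth (levels) limit: do not follow links further
            continue
        for linked in page_links.get(page, []):
            if linked not in level:
                level[linked] = level[page] + 1
                queue.append(linked)
    return {page: level[page] for page in crawled}


# Made-up site structure for illustration only.
site = {
    "/": ["/about", "/blog"],
    "/about": ["/team"],
    "/blog": ["/post1"],
    "/team": ["/bio"],
}
by_levels = crawl_site("/", site, max_levels=3, max_pages=10)
by_pages = crawl_site("/", site, max_levels=3, max_pages=2)
```

    In the first call the level limit stops the crawl before `/bio` (which would be level 4); in the second the page limit stops it after two pages. Neither setting says anything about how other websites are categorised.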

    I hope this clarifies things.

    Best regards,

    Apr 11, 2018
  • anon

    Thank you, this was really helpful.

    Apr 12, 2018