Yandex Search Ranking Factors

Showing 1923 of 1923 factors ~ 100% of all factors

P R

#0
Page rank. The factor is remapped.

T R

#1
Textual relevance (maxfreq is the frequency of the most frequent word, which serves as a measure of document length).

L R

#2
Link Relevance. The factor is remapped.
Priority bonus, priority 7 - text priority. The factor is binary: it is 0 for all single-word queries and 1 for almost all queries of two or more words, except for a very small number of results for which neither the links nor the text passed the quorum.
Strict priority for TR is text priority: all query words occur somewhere in the document (and they satisfy the contextual restrictions of the query, for example, both words must be in the same sentence).
Phrase priority for TR is text priority: all query words occur consecutively in the document.
(strict) All query words occur in one link.
(phrase) All query words occur consecutively in one link.
The presence of the exact phrase (query text) in the title (to be exact, in the first sentence of the document). Context constraints and stop words are taken into account exactly as in TRp2, i.e. factor[8] never exceeds factor[5].
A quorum site was encountered in which all word positions are marked as having BEST_RELEV relevance (header or meta keywords).

News

#11
This is news (determined by distinctive ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Klassificacionnye?v=tkd#h45859-3 patterns in url)) ).

Shop

#12
This is a store offer (determined by the characteristic ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Klassificacionnye?v=tkd#h45859-4 patterns in url)) ). Not used (deprecated)

Cat

#13
This is a directory (determined by characteristic ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Klassificacionnye?v=tkd#h45859-2 patterns in the url)) or by the Yandex directory).
Site traffic according to Bar data - ((http://wiki.yandex-team.ru/AndrejjKostjagin/YaBarLog/HostStat Data Description)). The factor is remapped.

Long

#15
Long document (the longer the document, the greater the value of the factor).
Hitweight is a variant of textual relevance in which the weights of all hits are considered equal (i.e. no bonuses for the title or for word proximity are applied). The relevant hits must still pass the constraints of the syntactic wizard, so the TRhitw factor is 0 if and only if SoftAndOk is 0.
The sum of the idf of the query words. The name does not reflect the essence: for example, for the query 'Gadyach' this factor will be greater than for the query 'Moscow Peter Yekaterinburg Samara'.
Long text without links.
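To make the remark about the sum of idf concrete, here is a minimal sketch. It assumes the common definition idf(w) = log(N / df(w)), which this list does not confirm; the collection size and the document frequencies are made-up numbers for illustration only, and the function name is hypothetical.

```python
import math

def query_idf_sum(query_words, doc_freq, num_docs):
    """Sum of idf over the query words, assuming idf(w) = log(N / df(w))."""
    return sum(math.log(num_docs / doc_freq[w]) for w in query_words)

# Hypothetical statistics: one rare word and four very common ones.
NUM_DOCS = 1_000_000
DOC_FREQ = {
    "gadyach": 10,
    "moscow": 300_000,
    "peter": 250_000,
    "yekaterinburg": 200_000,
    "samara": 200_000,
}

rare = query_idf_sum(["gadyach"], DOC_FREQ, NUM_DOCS)
common = query_idf_sum(["moscow", "peter", "yekaterinburg", "samara"],
                       DOC_FREQ, NUM_DOCS)
```

With these invented numbers the single rare word outweighs the four frequent ones, so the factor tracks word rarity rather than query length.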

Root

#19
This is the main page of the site.

Geo

#22
Indicates a match between the user's region and the site at the country level. The factor is binary: 1-match, 0-no. Based on ((http://wiki.yandex-team.ru/ЯндексПоиск/КлассификацияСайтовИСтраниц/Географическая/ИспользованиеВПоиске geoclassification of sites))
Matching thematic spectra of the query and the document. The subject of the query is the output of the ((http://wiki.yandex-team.ru/EvgenijjKroxalev/subquery SubquerySearch wizard rules)). The subjects of the document are taken from the Yandex catalog.

S R

#24
A complex static rank, assembled from static components by a separate formula((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/#oftnd1 *)).
Factor about the number of refines. The query language has a user-refine feature (a word preceded by a percent sign). It is supposed to mean something like 'it would be nice to have this word in the document'. The only valuable use of this feature known to ((http://staff.yandex-team.ru/gulin Andrey Gulin)) is querying [%official %site FirmName]. Users do not know about this feature, since it is not described in any documentation. It is planned to remove it from the query language, but words with USER_REFINE priority will remain in the wizard. The factor reports the maximum number of USER_REFINE words encountered simultaneously within a single quorum hit. The count is taken to lie between 0 and 3 (values above 3 are treated as 3). This number is mapped to the half-interval [0,1).
The number by which some link factors (namely, factors number 6, 7, 47, 66) are multiplied if the textual relevance is 0 and there are few links
In textual relevance, a lemma match occurred.
Remapped mascot feature TrafgraphOutAll_share_d
Dssm model, trained on reformulations, uses relevant sentences in the document part
The value of the news detector calculated in Begemot. Always 0 when the detector value is less than the threshold.
The converted number of query words in all url links.
The document has LR > 20 and more than 16 occurrences of the query words in the links; a factor about LR.
For documents with high LR - normalized link relevance without regard to proximity, for documents with low LR 0
Url high LR.
Quality of incoming links (Leschiner's classifier) - broken, see [405]
CosineMatchMaxPrediction factor value for the AliceMusic stream
Number of incoming links. Remapped.
Popularity of the query
TR divided by the cube of the number of words in the query and converted by the standard remapTR.
The language of the document is Russian.
Time since the page was added; a larger value means an older document. The square root of the time is mapped to the interval [0,1] so that 3+ years gives 1.
If this is the main page of the owner (most often a second-level domain, such as xxxx.ru), the factor is 1. For free hosting services, personal blog platforms, etc. (e.g. LiveJournal, narod.ru), third-level domains (such as xxxxx.narod.ru) will also have a factor of 1.
Time since the main page of the owner (host?) was added; remapped in the same way as AddTime.
The value of the AnnotationMaxValueWeighted factor for the AliceMusic stream
How often the URL is clicked on this query - CTR multiplied by the correction factor
Simple BM25 by text.
Simple BM25 by links, link weights are not taken into account.
Simple BM25 by text and links at the same time.
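The list does not spell out what "simple BM25" computes internally. As a point of reference only, here is a minimal sketch of the textbook Okapi BM25 score; the parameters `k1` and `b`, the precomputed `idf` table and the function name are assumptions, not details taken from this list.

```python
from collections import Counter

def bm25(query_words, doc_words, idf, avg_doc_len, k1=1.2, b=0.75):
    """Textbook Okapi BM25 of a tokenized document for a tokenized query."""
    tf = Counter(doc_words)
    # Length normalization: longer-than-average documents are penalized.
    norm = k1 * (1 - b + b * len(doc_words) / avg_doc_len)
    score = 0.0
    for word in query_words:
        freq = tf[word]
        if freq:
            score += idf.get(word, 0.0) * freq * (k1 + 1) / (freq + norm)
    return score
```

Under this reading, the "by text", "by links" and "by text and links" variants would differ only in which tokens are passed as `doc_words`.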

T Lp1

#49
All query words are in the text + links.

Adv

#50
There are ads on the site.
There are Yandex ads on the site.
Spam classifier by anti-spam chips recognized the site as NOT(!) spam. I.e. 0=spam, 1=good.
Simple BM25 by word pairs: we take all pairs of query words and count their occurrences in the text of the document. The sum of the word weights is used as the pair weight. Note: it does not work if the query contains a stop word.
Same as TxtPair, but for links; link weights are not taken into account.
BM25 from the number of sentences in the document in which a query word occurs.
BM25 by the words in the title only.
BM25 on words only with high rel bits ('significant', with highlighting (<b>, etc.)).
Min(number of query words/10, 1.f)
1 / number_words_in_request.
The document does not have a TR.
The document does not have LR.
There is no clickability information for this URL and this query: 1 - the query or the query-URL pair is not in the clickbase, 0 - the query-URL pair is in the clickbase.
There is no clickability information for this query: 1 - the query is not in the clickbase, 0 - the query is in the clickbase.

Hops

#65
The number of hops to the URL in a roundtrip: the fewer the hops (the closer to the main page), the smaller the value (0 - main page; 1 - cannot be reached from the main page; between 0 and 1 - can be reached from the main page). Normal value for nost root is 0.0039.
Logarithm from LR, linearly mapped in [0,1].
Presence of word pairs in exact form
The number of sentences in which there are many words in exact form
The presence of words in the title in exact form
BM25 in exact form
Simple BM25 in exact form.
Presence of word pairs with synonyms (>=TxtPair)
The number of sentences in which there are many words, with synonyms taken into account
The presence of words in the title, taking synonyms into account
BM25 including synonyms
Simple BM25 with synonyms taken into account.
How often the URLs of the given domainId are clicked on the given query - CTR domainId multiplied by the correction factor
There is no clickability information for this domainId and this query: 1 - the query or the query-owner pair is not in the clickbase, 0 - the query-owner pair is in the clickbase.
Clickability of the owner regardless of the request
Relative frequency of query words in links (1 - query words often occur in links, 0.3 - rarely); more precisely, the value of this factor is pessimized if: TR=0 && LR=0 && (no links with all query words) && (no quorum) && (at least one pair of query words occurs in the text)
The links have all the words of the query
One link has all the words of the query
There is a link that passed the quorum
What proportion of links are "good"
How many "bad" links there are (bad means dpr = 0)
Maximum dpr among the links
TfIdf is the usual TF*IDF by links. The word frequency in the links is multiplied by the inverse document frequency and summed over all words, then normalized by the document length.
Link relevance by Gulin
Link relevance by Gulin
Link relevance by Gulin
There is an exact form of all query words in the text/links
There is a lemma of all query words in the text/links
The document passed softand by the syntax wizard's constraints. Only for documents with textual relevance. For single-word queries it is always 1.
Incoming link quality classifier 2 - broken, see [407]
equals one if the site has a Ukrainian geo-attribute (ie, 1 - Ukrainian site)
Blog page
Page from livejournal.com

Spam2

#99
Alekseev's automatic spam classifier: probability that the site is spam (0 - not spam, 1 - spam)
Text quality. Calculated according to a rather complicated formula
Text quality (Alekseev's classifier)
Core audience of the owner according to Yandex.Browsing
Core audience of the host according to Yandex.Browsing
Whether the host has a core audience
Spam name karma of antispammers - probability that the host is spam; based on whois information
Musicality of the query. Output of Anton Konygin's wizard.
the number of links that exactly match the query
Document length in sentences
URL length divided by 5
The commerciality of the query according to the Direct phrase dictionary: 0 - maximum commerciality, 1 - minimum commerciality.
Host size in documents according to Raskovalov, without deduplication (each duplicate is counted in the factor as an independent document)
Document type - HTML
The reciprocal of the variance of the times at which links containing the query words appeared
Link relevance with thematicity
Link relevance with thematicity
Link relevance with thematicity
Link relevance taking into account the quality of each link
Link relevance taking into account the quality of each link
Link relevance taking into account the quality of each link
Link relevance, taking into account the quality of each link and the thematicity of each link
Link relevance, taking into account the non-commerciality of each link
Link relevance, taking into account the non-commerciality of each link and thematicity
Link relevance, taking into account the non-commerciality of each link and the quality of each link
Link relevance, taking into account the non-commerciality of each link, the quality of each link and thematicity
Indicates a match between the region mentioned in the query and the found sites at the region level. The factor is binary: 1-match, 0-no. Based on ((http://wiki.yandex-team.ru/ЯндексПоиск/КлассификацияСайтовИСтраниц/Географическая/ИспользованиеВПоиске geoclassification of sites))
Percentage of incoming links with query words
Percentage of incoming links with all query words
Does the query contain words from yweb/pornofilter/porno.query.
Porn Chick document
A document from a commercially-available book. Not used (deprecated)
fake document
The title of the page contains commercial vocabulary. Not used (deprecated)
page from ru.wikipedia.org
commercial page (Savin's classifier)
The document does not contain all query words (up to synonyms)
Percentage of query words in the document (up to synonyms)
The document has all the words of the query (up to synonyms)
Percentage of query words in links (up to synonyms)
The links have all the words of the query (up to synonyms)
The value of the commerce detector calculated in Begemot.
TR by pairs of query words in reverse order
LR by pairs of query words in reverse order
TR by pairs of query words separated by one word, in texts
LR by pairs of query words separated by one word, in links
Percentage of all query words in the text (exact forms only)
The document has all the words of the query (exact forms only)
Degree of concentration of the locations from which the query is issued

Q Blog

#151
Does the query contain blog language?
log(LR, narrowed by the user's country)
log(LerfLR, narrowed to the user's country)
Binary non-commerciality: QueryNonCommerciality > 0.965.
Number of links that match the query text (other remap)
XLerfLRlogRelev (normalized by the sum of the Lerf-weights of all links, not by the sum of their initial weights)
XNonCommLRlogRelev (normalized by the sum of the NonComm weights of all links, not by the sum of their initial weights)
Link relevance, taking into account the non-commerciality of each link and thematicity
XNonCommLerfNormLRlogRelev (normalized to the sum of NonCommLerf-weights of all links, not the sum of their original weights)
Link relevance, taking into account the non-commerciality of each link, the quality of each link and thematicity
Not used. Content duplication: the 'goodness' of a host (0 to 1), calculated from how many and which hosts borrow content from this one.
Not used. Content duplication: host 'badness' (0 to 1), proportional to the amount of secondary content on the host.
Average age of the links that contributed to LR: LinkAge = Min(log(average link age)/7, 1); a value of 1 corresponds to 3 years.
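A minimal sketch of the LinkAge formula. It assumes the age is measured in days and the logarithm is natural, which is consistent with the 3-year cap because ln(3 * 365) is about 7; the unit, the log base, the clamping of very small ages to 0 and the function name are assumptions.

```python
import math

def link_age_factor(avg_link_age_days):
    """LinkAge = min(log(average link age) / 7, 1), with the age in days."""
    if avg_link_age_days <= 1:
        # Assumption: ages of a day or less are clamped to 0 instead of
        # producing a negative logarithm.
        return 0.0
    return min(math.log(avg_link_age_days) / 7.0, 1.0)
```

For example, `link_age_factor(365)` is roughly 0.84, so most of the range is used up within the first year and the factor saturates at about three years.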

T Len

#164
Page text length in words TLen = Map(number of words, 1/400), where Map(x, y) = x*y / (1 + x*y)
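The Map remapping used for TLen can be written directly from the formula above. The function names are hypothetical.

```python
def remap(x, y):
    """Map(x, y) = x*y / (1 + x*y): squeezes a non-negative x into [0, 1)."""
    return x * y / (1 + x * y)

def text_len_factor(num_words):
    """TLen = Map(number of words, 1/400)."""
    return remap(num_words, 1 / 400)
```

A page of 400 words gives 0.5, and the value approaches 1 only for very long pages, so the scale 1/400 sets the text length at which the factor reaches its midpoint.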
The page is unreachable via links from the main page.
LR with link and query language matching
LR taking into account the coincidence of the language of the link and the request and the tipping
The ratio of the number of clicks on the given url to all clicks on the request
The ratio of the number of clicks on the given domainId to all clicks on the query
[Bug: copy of factor 45] How often a given URL is clicked on: CTR multiplied by the correction factor
The share (on average per session) of clicks that go to this URL for this query with the user's city added to it. It is computed from user sessions.
How often a given URL is clicked on for a given query - CTR multiplied by the correction factor, by small regions from relev_regions.web.txt
How often the URLs of a given domainId are clicked on for a given query - CTR domainId multiplied by the correction factor, by small regions from relev_regions.web.txt
the ratio of the number of clicks on the given url to all clicks on the query, by small regions from relev_regions.web.txt
the ratio of the number of clicks on the given domainId to all clicks on the query, by small regions from relev_regions.web.txt
Query URL Clicks Combo, by small regions from relev_regions.web.txt
Query DOwner Clicks Combo, by small regions from relev_regions.web.txt
LR by catalog descriptions
LR by unsubscription in Yandex.Catalog
Length of maximum matching forms in text and query
Weight of the maximum form match in the text and query
Length of maximal lemma match in the text and query
Weight of the maximum lemma match in the text and query
Maximum age of a significant accumulation of links that have contributed something to the LR
Variants of the relevant factors, taking into account the stop words
Variants of the relevant factors, taking into account the stop words
Variants of the relevant factors, taking into account the stop words
Variants of the relevant factors, taking into account the stop words
Variants of the relevant factors, taking into account the stop words
TR best passages - how good a snippet can get
TR with a discount for the sentence number
Host rank by the most characteristic query word (usually the name of the site)
Clickability of domAttr by the most characteristic query word. For example, for all queries containing the word wikipedia, users click on wikipedia.
HostRank by individual words
Clickability of the domain by words
The URL matches the FORUM_DETECTOR regular expression
AnnotationMatchWeightedValue factor value for the AliceMusic stream
There is an old date in the URL; used to recognize old news. The factor is 1 if the URL contains a year <= 2007.
Weight of the maximum form match in the text and query
Weight of the maximum form match in the text and query
The page is about 'paying for SMS'.
Anti-spammers have pessimized the site - all dynamic link factors are zeroed. zerolnk.flt
The shop-like nature of the page
The pornographic nature of the page
Remapped mascot feature TrafgraphOutAll_share_m
Remapped mascot feature TrafgraphOutAllSE_share_d
Remapped mascot feature TrafgraphOutAllSE_share_m
Remapped mascot feature NoExtClicksShare
Search engine traffic - visits to the site from search engines (2nd formula)
Search engine traffic - visits to the site from search engines (2nd formula)
Visits to the site from search engines for individual words, according to the bar
The value of the BclmMixPlainK000001 factor for the AliceMusic stream
The largest common substring of the url and query, normalized by the length of the url
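A sketch of the largest-common-substring factor, using a standard dynamic-programming pass over the two strings. Lowercasing both strings and returning 0 for an empty URL are assumptions, as are the function names; any transliteration of the query is left out.

```python
def longest_common_substring_len(a, b):
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        cur = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                # Extend the common run that ended at the previous characters.
                cur[j] = prev[j - 1] + 1
                best = max(best, cur[j])
        prev = cur
    return best

def url_query_match(url, query):
    """Longest common substring of url and query, divided by the url length."""
    if not url:
        return 0.0
    common = longest_common_substring_len(url.lower(), query.lower())
    return common / len(url)
```

Normalizing by the URL length means a short URL that consists mostly of the query text scores higher than a long URL that merely contains it.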
All matches are in the URL only, no matches in the text of the page
Three levels of matching user and page geography
Three levels of link and query region matching
Geographic proximity
Whether the query is navigational, judged by the clickability of the results
The most characteristic query word corresponding to the site, according to the bar
Clickability of the host by the first query word. Quite often the first (last) word of the query is an explicit indication of the site where the information should be searched for.
The value of the CMMatchTop5AvgMatch factor for the AliceMusic stream
The average continuous time (in seconds) users spend on the host's pages after arriving from a search engine for the query (the factor depends on the pair (query,domAttr)).
The average continuous time (in seconds) users spend on the host's pages after arriving from a search engine for the query (the factor depends on the pair (query,domAttr)). According to the Yandex.Bar/Elements/Browser internal counter
The average number of active actions (clicks, keystrokes) by users during a continuous stay on the host's pages after arriving from a search engine (the factor depends on the pair (query,domAttr)). According to the Yandex.Bar/Elements/Browser internal counter
Number of unique visitors from search engines for a particular query
The average continuous time (in seconds) a user spends on the page after arriving from a search engine for the query (the factor depends on the pair (query,url)).
The average continuous time (in seconds) a user spends on the page after arriving from a search engine (the factor depends on the pair (query,url)). According to the Yandex.Bar/Elements/Browser internal counter
The average number of active actions (clicks, keystrokes) by users on the page after arriving from a search engine for the query (the factor depends on the pair (query,url))
A pool of PRS logs is tagged using Bert trained on sinsig. The dssm model is trained on this pool, using BaseRegionChain
A pool of PRS logs is tagged using Bert trained for relevance. The dssm model is trained on this pool, using BaseRegionChain
PerWordCMMaxMatchMin factor value for the AliceMusic stream
Value of the AttenV1_Bm15_K05 factor for the AliceMusic stream
The value of the AnnotationMaxValueWeighted factor for the AliceMusic stream
The query is not in Russian
Document from a foreign cluster
Page region size
The factor is inversely proportional to the size of the page region
Query region size
The factor is inversely proportional to the size of the query region
Geographical proximity of the user and the site
Characterizes the promotion of the site by link rings. The value is the share of external links that are included in the link rings and link exchanges.
The number of unique visitors, remapped exponentially
Share of traffic from search engines
The share of visits to the site that did not come through links (URL typed in manually or opened from bookmarks)
The average continuous active user time (in sec) on the host pages
The average continuous user time (in sec) on the host's pages. According to the internal counter of Yandex.Bar/Elements/Browser
The average number of active actions (clicks, keystrokes) by users when a user is continuously on the host pages (in sec).
implementation of the algorithm described in the article ((http://wiki.yandex-team.ru//h.yandex.net/?http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fpeople%2Ftyliu%2Ffp032-liu.pdf http://research.microsoft.com/en-us/people/tyliu/fp032-liu.pdf))
Attendance of the url according to me-bar data
Number of unique visitors to the url
The average time a user spends on the page, computed as the time difference between consecutive transitions.
This is SEA factor = s4_r/ (k_r+10) where s4_r - number of clicks > 180 sec, k_r - total number of clicks. It is calculated taking into account the reformulations.
This is SEA factor = s4_r/ (k_r+10) where s4_r - number of clicks > 180 sec, k_r - total number of clicks. It is calculated taking into account the reformulations. Localized version
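The SEA formula is easy to evaluate by hand, but a small sketch shows the effect of the constant 10 in the denominator. Reading that constant as a smoothing term is an interpretation rather than something stated in this list, and the function and parameter names are hypothetical.

```python
def sea_factor(long_clicks, total_clicks, smoothing=10):
    """SEA = s4_r / (k_r + 10).

    long_clicks  - s4_r, the number of clicks longer than 180 seconds
    total_clicks - k_r, the total number of clicks
    """
    return long_clicks / (total_clicks + smoothing)
```

A URL with 1 long click out of 1 scores about 0.09, while one with 50 long clicks out of 50 scores about 0.83, so a URL needs a reasonable volume of clicks before the ratio of long clicks dominates.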
The degree of diversity of queries clicked on this url
The page is commercial by keyword. Not used (deprecated)
Idf by different parts of the document, broken, not used
Idf by different parts of the document, broken, not used
Idf by different parts of the document, broken, not used
Idf by different parts of the document, broken, not used
The link factor about having a video on the page.
BM25 by user region for localizable queries, for non-localizable queries in CUBE - country. Texts of queries sent for regions can be viewed in relev_regions.txt in the wizard
Same for link relevance
Share of incoming paid links. An algorithm for recognizing commercial links has been implemented. The factor is remapped to [0,1] if the share of such links is > 50%, otherwise it is 0. ((http://wiki.yandex-team.ru/SvetlanaShorina/topseolinks sample of manipulated sites))
The previous factor multiplied by PornoQuery
CommLinksSEOHosts factor multiplied by NonCommercialQuery
The query mentions a product category. Not used (deprecated)
The query mentions a vendor. Not used (deprecated)
Geographical distribution of the query
The query is mostly issued at night
The query is mostly issued in the morning
The query is mostly issued during the day
The query is mostly issued in the evening
How pronounced the query's dependence on the time of day is

L Cor

#281
Characterizes the frequency of words in links. The factor is large if the word that contributed to the link relevance is rare in links.
Matching thematic spectra of the query and the document. The subject of the query is the output of the ((http://wiki.yandex-team.ru/EvgenijjKroxalev/subquery SubquerySearch wizard rules)). The topic of the document is determined by the automatic classifier.
Weight of query words that are in the text
Weight of query words that are in the links
Weight of query words that are in the text and links
Entropy - click distribution
Entropy - distribution of displays
Entropy - distribution of clicks/shows ratio
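The list does not say which distribution the entropy factors are taken over or how they are remapped. As a generic illustration, here is Shannon entropy of a distribution given as non-negative counts (for example clicks per result, which is an assumption); the natural logarithm and the function name are also assumptions.

```python
import math

def entropy(counts):
    """Shannon entropy (in nats) of the distribution defined by the counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)
```

When all clicks go to a single result the entropy is 0; clicks spread evenly over n results give log(n), the maximum for n outcomes.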
Pornographic nature of the document according to the link text
Pornographic nature of the document according to the link text, with a different normalization
PornoQuery classifier, a different dictionary than PornoQuery
Value of the AttenV1_Bm15_K05 factor for the AliceMusic stream
Geographical proximity of the country of the site and the country of the request
Coverage of the domain by three-letter sequences (trigrams) from the query (Chelyabinsk lottery - chelloto). The query is transliterated, the trigrams that are covered are found (che, hel, lot, olo), and the share of all trigrams that are covered is computed.
Same as the previous factor, but about the whole url except the domain
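One possible reading of the three-letter coverage factors above, as a sketch. It assumes the query has already been transliterated, that trigrams do not span word boundaries in the query, and that the share is taken over the trigrams of the domain (or of the rest of the URL); none of these details is confirmed by the list, and the function names are hypothetical.

```python
def trigrams(s):
    """Set of all three-letter substrings of s."""
    return {s[i:i + 3] for i in range(len(s) - 2)}

def trigram_coverage(target, translit_query):
    """Share of the target's trigrams that also occur in a query word."""
    target_tris = trigrams(target.lower())
    if not target_tris:
        return 0.0
    query_tris = set()
    for word in translit_query.lower().split():
        query_tris |= trigrams(word)
    return len(target_tris & query_tris) / len(target_tris)
```

With the transliteration assumed here, `trigram_coverage("chelloto", "chelyabinsk loto")` covers four of the six domain trigrams (che, hel, lot, oto).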
The query is locale-specific. The query is often reformulated with an explicit region assignment. ((https://ml.yandex-team.ru/archive/thread1433892/#message1433892 more info))
We compute text features assuming that the page's title is attached to each of its sentences, i.e. the distance between a word from the title and any other word is 1 sentence. Len is the maximum, over the sentences of the text (with the title attached), of the share of query words found in the sentence relative to the length of the query. Example: [Хармс цирк Вертунов] for ((http://wiki.yandex-team.ru//h.yandex.net/?http%3A%2F%2Fwww.wikilivres.info%2Fwiki%2F%25D0%25A6%25D0%25B8%25D1%2580%25D0%25BA_%25D0%25A8%25D0%25B0%25D1%2580%25D0%25B4%25D0%25B0%25D0%25BC_%28%25D0%25A5%25D0%25B0%25D1%2580%25D0%25BC%25D1%2581%29 this document))
The ratio of the sum of the idf of the encountered words in the sentence+title to all words.
The same as JokerLen, on exact forms
The same as JokerWeight, on exact forms
Remapped mascot feature More120SecVisitsNotSearchShare
Analogs to the corresponding text factors for links. BM25 from the number of links in which there was a match.
Simple BM25 on the exact form in the link texts
The presence of word pairs in links, taking into account synonyms
Number of links that passed the threshold
Simple BM25 by links with synonyms
Video request
Clickability of the owner regardless of the request, separately by region
Entropy - click distribution. Regionalized
Entropy - distribution of displays. Regionalized
Entropy - distribution of clicks/shows ratio. Regionalized
equals 2 * NastyContent
equals 2 * NastyContent
always zero

Is Com

#315
.com domain

Is Ua

#316
Domain in the zone .ua
The domain is not in the .ru zone
LR by links from Yandex.Market

Poetry

#319
The poetry of the document
Maximum poetry of the quatrain
The language of the document is English
The query is completely covered by two exact groups consisting of exact match words of the query in a row ((http://wiki.yandex-team.ru/poiskovajaplatforma/tr/CoverageByGroups Progroup coverage))
There is a group consisting of exact match words of the query, covering the query (possibly with an omission, addition or substitution of a word)
The fraction of the query covered by the longest group consisting of any hits (including word forms and synonyms). Possibly with omission, addition or substitution of a word
Characterizes the proximity of time profiles of the request and documents on business days
Characterizes the proximity of time profiles of the request and documents on weekends
The language of the document is Cyrillic
Query factors - the result of the ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/GeoRegionality query geolocalization classifier)). U - geo-irrelevant: regional results for the query are meaningless
R - geo-relevant: regional results in the output could be useful, but no more than that
V - geo-vital: regional results are essential
There are no numbers in the url
The value of the AllWcmMaxMatch factor for the AliceMusic stream
CosineMatchMaxPrediction factor value for the AliceMusic stream

Syn S1

#334
Indicates how unnatural the text is from the point of view of the Russian language. Estimates how much of the document text can be considered synonymizer-generated or automatically generated in general. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=1il#h58953-2 more details))
Indicates how unnatural the text is from the point of view of the Russian language. Estimates how much of the document text can be considered synonymizer-generated or automatically generated in general. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=1il#h58953-2 more details))
Indicates how unnatural the text is from the point of view of the Russian language. Estimates how much of the document text can be considered synonymizer-generated or automatically generated in general. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=1il#h58953-2 more details))
nd/k normalized time to click
selected formula
r_s4b/(r_k + 10)
Does the query have full parsing
The date of the document, which is written on the page, is remapped by the square root
Remapped mascot feature VisitsPVisitors
Additional factors about the promotion of the site by link rings, ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181r#h58953-4 more info))
Additional factors about the promotion of the site by link rings, ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181r#h58953-4 more info))
Additional factors about the promotion of the site by link rings, ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181r#h58953-4 more info))
The document has textual relevance
BM25, where the 'words' are the highlighted query segments
The 'weight' of the query segments in the text
Indicator of the unnaturalness of the text from the point of view of the Russian language. The number of bad word pairs in the text, renormalized in the interval [0,1] by the formula z/(z+10)
Proportion of bad pairs among all pairs found in the table: z/(x+1), where z is the number of bad pairs in the text, and x is the number ((http://wiki.yandex-team.ru/EvgenijjGrechnikov/TestSynonimizers 2000-relevant)) of pairs
The number of Latin letters in the text (not counting the markup), mapped to [0,1] by the formula n/(n+100)
Additional factors about the promotion of the site by link rings, ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181r#h58953-4 more info))
Previous factors - corrected
Previous factors - corrected
Previous factors - corrected
Previous factors - corrected
factor, cleverly combined from FRC and pseudo-CTR
factor, cleverly combined from FRC and pseudo-CTR
Link relevance with pessimization for link age
The number of words in the text (a word is whatever the lemmer extracted), mapped to [0,1] by the formula x/(x+A)
Number of Russian words in the title
Average word length
Percentage of words inside the <a>...</a> tag among all words
Percentage of words outside the tags (outside the <> brackets) among all words
Share of the 200 most frequent words in the language among all words in the text
Number of the 500 most popular language words used in the text, divided by 500
The logarithm of the geometric mean probability of trigrams in the text (the probability of a trigram is the number of its occurrences in the text divided by the number of all trigrams), mapped into [0,1] by the formula -x(x+A)
The logarithm of the geometric mean of the conditional probabilities of trigrams. The conditional probability of a trigram is its probability divided by the probability of the bigram of its first two words
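A sketch of the two trigram statistics described above, before any remapping into [0,1]. It assumes word-level trigrams, averaging over trigram occurrences rather than distinct trigrams, and bigram probabilities estimated the same way as trigram probabilities; these choices and all names are assumptions.

```python
import math
from collections import Counter

def ngrams(words, n):
    """All consecutive n-word tuples of the text, in order."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def log_geo_mean_trigram_prob(words):
    """Log of the geometric mean of trigram probabilities (mean of logs)."""
    tri = ngrams(words, 3)
    if not tri:
        return 0.0
    counts = Counter(tri)
    total = len(tri)
    return sum(math.log(counts[t] / total) for t in tri) / total

def log_geo_mean_conditional_prob(words):
    """Same, for P(trigram) / P(bigram of the trigram's first two words)."""
    tri = ngrams(words, 3)
    if not tri:
        return 0.0
    tri_counts, bi_counts = Counter(tri), Counter(ngrams(words, 2))
    n_tri, n_bi = len(tri), len(words) - 1
    # The two probabilities use different totals, so the ratio can exceed 1.
    return sum(
        math.log((tri_counts[t] / n_tri) / (bi_counts[t[:2]] / n_bi))
        for t in tri
    ) / n_tri
```

Highly repetitive text pushes both values toward 0, while text in which every trigram is unique gives the most negative value for its length.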
An analogue of the QueryDOwnerClicksPCTR factor; it differs in that queries are normalized by doppelgangers (details of this normalization are available from ((http://staff.yandex-team.ru/finder Andrei Plakhov)); code: ysite/yandex/doppelgangers)
An analogue of the QueryDOwnerClicksPCTR factor; it differs in that queries are normalized by doppelgangers (details of this normalization are available from ((http://staff.yandex-team.ru/finder Andrei Plakhov)); code: ysite/yandex/doppelgangers). Localized to relev_regions.web.txt
An analogue of the QueryUrlClicksPCTR factor; it differs in that queries are normalized by doppelgangers (details of this normalization are available from ((http://staff.yandex-team.ru/finder Andrei Plakhov)); code: ysite/yandex/doppelgangers)
An analogue of the QueryUrlClicksPCTR factor; it differs in that queries are normalized by doppelgangers (details of this normalization are available from ((http://staff.yandex-team.ru/finder Andrei Plakhov)); code: ysite/yandex/doppelgangers). Localized to relev_regions.web.txt
BM25 by URL
There is a big picture on the page
A MatrixNet formula applied to all factors (TG_UNUSED, so that it does not enter any formula)
The difference between the current date and the document date determined by DaterAge: 1 means the document date is the current date, 0 means the document is 10 years old or more. If no date is determined, the value is 0. Note: ((1 - DaterAge)*60)^2 = page age in days.
Hard pessimization (aka PR=0); binary factor, computed by anti-spam
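The note about ((1 - DaterAge)*60)^2 can be checked with a small conversion helper. The function name is hypothetical; the formula is the one stated in the description.

```python
def dater_age_to_days(dater_age: float) -> float:
    """Convert a DaterAge value in [0, 1] to page age in days,
    using ((1 - DaterAge) * 60) ** 2 from the factor description."""
    if not 0.0 <= dater_age <= 1.0:
        raise ValueError("dater_age must be within [0, 1]")
    return ((1.0 - dater_age) * 60.0) ** 2


print(dater_age_to_days(0.5))  # 900.0 days
```

A value of 0 gives 3600 days, which is roughly the 10-year cap mentioned above; note that 0 is also the value used when no date was determined, so the two cases cannot be told apart from the factor alone.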
Host factors, determine link-stuffed sites - second and third inbound degrees ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181rh58953-4#cindegree12 more info))
Host factors, determine link-stuffed sites - second and third inbound degrees ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/antispam?v=181rh58953-4#cindegree12 more info))
Number of incoming links without Russian letters. Remapped.
Maximum number of forms for all query words - max for all query words number_form_for_word/64
Sum of the number of forms weighted by word weights: the sum over all query words of number_form_for_word/64*word_weight; remapped as x/(1 + x).
Unweighted sum of number of forms - sum over all query words number_form_for_word/64/number_words_query
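The three word-form formulas above (maximum, weighted sum with remap, unweighted average) can be sketched together. The function name and the input representation (parallel lists of form counts and word weights) are assumptions made for illustration.

```python
def form_factors(form_counts, word_weights):
    """Compute the three word-form statistics for one query.

    form_counts[i]  - number of forms of the i-th query word
    word_weights[i] - weight of the i-th query word
    Returns (max_forms, weighted_sum_remapped, unweighted_average).
    """
    if not form_counts or len(form_counts) != len(word_weights):
        raise ValueError("need one weight per query word")
    scaled = [n / 64 for n in form_counts]
    max_forms = max(scaled)
    weighted = sum(s * w for s, w in zip(scaled, word_weights))
    weighted_remapped = weighted / (1 + weighted)  # remap x / (1 + x)
    unweighted = sum(scaled) / len(scaled)
    return max_forms, weighted_remapped, unweighted


print(form_factors([64, 32], [0.5, 1.0]))  # (1.0, 0.5, 0.75)
```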
Maximum number of forms for all query words
Weighted by word weights, the sum of the number of forms
Unweighted sum of the number of forms
Analogs of factors of the same name, word weight = 1
Analogs of factors of the same name, word weight = 1
Analogs of factors of the same name, word weight = 1
Analogs of factors of the same name, word weight = 1
Analogs of factors of the same name, word weight = 1
Analogs of factors of the same name, word weight = 1
Query segments are parts of a query that are themselves frequent queries. The factor shows how often the segments are broken up in the text: 0 means all words occur only within the designated segments, 1 means all occurrences break segments
The value of the CMMatchTop5AvgMatch factor for the AliceMusic stream
Proportion of different parts of speech in the text. Proportion of numerals (among all words whose part of speech could be recognized)
Proportion of particles
Proportion of pronominal adjectives
Proportion of pronouns
Proportion of verbs
the proportion of words that can be both masculine and feminine nouns, but not neuter, among all nouns (examples: 'hummingbird' is an example of indefinite gender, which can be defined in two ways, 'Alexandra' is a homonym).
Quality of incoming links (Leschiner's classifier), corrected
Whether or not LinkQuality was computed for this page (not computed if there are few links), corrected
Incoming link quality classifier 2, corrected

Is Org

#408
The query contains the name of an organization (example: Gazprom) ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares Description))
CMMatchTop5AvgMatchValue factor value for the AliceMusic stream
The size of the largest text segment of the page (from the [18] PureText factor)
link relevance without regard to rare words
Number of different internal links per page
A city is defined for the site
Query factors: result of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy query geolocalization classifier)), a new version of factors [328]-[330]: U - not geo-relevant: regional results are meaningless;
Query factors: result of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy query geolocalization classifier)), a new version of factors [328]-[330]: R - geo-relevant: regional results could be useful, but no more than that;
Query factors: result of the ((http://wiki.yandex-team.ru/PoiskovajaPlatforma/Lingvistika/ZaprosnyjeFactory/LocalizovannyjeZaprosy query geolocalization classifier)), a new version of factors [328]-[330]: V - geo-local: regional results are essential.
PerWordCMMaxPredictionMin factor value for the AliceMusic stream
Ukrainian Page rank
=1 - on Download formula. Class queries: download/view online/play/photo/listen
Query classifier result - the query has words from the appropriate dictionary. brand
medical dictionary
Question
a request specific to Moscow
organization
porn
travels
The popularity of video, comes from video
Frequency of links to the site
Number of almost-periodic references
The number of impressions for the query, normalized as x/(100 + x).
The number of impressions of the URL for the query, normalized as x/(100 + x).
LiveInternet counter
Popularity of the owner in queries
DSSM model with early binding, trained on reformulations, and pre-trained on ASR hypotheses of musical queries to Alice
Model trained on PRS-log pool on Bert's prediction trained on sinsig_ce with a threshold of 0.5, using a chain of regions to country
DSSM model with early binding, trained on reformulations and retrained on music requests to Alice
Eleven factors based on statistical properties of the distributions of incoming vertex degrees referring to a fixed vertex of the hostgraph.((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/hostdegree details))
The value of the pirate detector calculated in behemoth.
Yandex music canonized url type - album
Calculated as (10-x) where x is the return of the document in days (continuous) relative to the validity time of the document in the samovar
Document host recognized in query
The URL consists only of the host that is recognized in the request
URL is a Yandex news story
URL feature computed from rapid clicks spy_log counters with decay of 1 day
URL feature computed from rapid clicks spy_log counters with decay of 1 day
URL feature computed from rapid clicks spy_log counters with decay of 0.5 days
URL feature computed from rapid clicks spy_log counters with decay of 0.5 days
They are calculated as (80 - x) / 80, where x is the age of the document in hours. The factors make sense only for the quickbot base (the last 80 hours). They are not used in ranking. They are used in reranking.
They are calculated as (80 - x) / 80, where x is the age of the document in hours. The factors make sense only for the quickbot base (the last 80 hours). They are not used in ranking. They are used in reranking.
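The (80 - x) / 80 freshness value described above is a linear decay over the 80-hour quickbot window. A minimal sketch, with a hypothetical function name and the 80-hour horizon exposed as a parameter:

```python
def quick_freshness(age_hours: float, horizon_hours: float = 80.0) -> float:
    """Linear freshness: 1.0 for a brand-new document, 0.0 at the horizon.

    Mirrors (80 - x) / 80 from the description. Values outside the
    0..80 hour window are not clamped, since the description says the
    factor only makes sense for the last 80 hours.
    """
    return (horizon_hours - age_hours) / horizon_hours


print(quick_freshness(40))  # 0.5
```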

Swbm25

#452
A smart BM25 in a sliding window. The window size is specified in sentences. Uses "jokers" (wildcards) for titles and the beginning of the document. Morphological proximity and text structure are taken into account. The weight of the window decays with distance from the beginning of the document.
Factor about how good a snippet can get.
Simple BM25 by word pairs: take all pairs of query words and count the number of their occurrences in the text of the document. Weight = 1. Comment: does not work if the query contains a stop word
The logarithm of the number of shingles on which a given document is not unique
The logarithm of the number of shingles on which a given document owner is recognized as an author
Average weight of non-unique shingles of this document
Mascot feature MarketQualityRating
Medical host quality for new marks.
Medical host quality for new marks for experiments.
Finance or law host quality for new marks.
Finance or law host quality for new marks for experiments.
Finance or law host quality for new marks.
Finance or law host quality for new marks for experiments.
Factor for host in list of documentation cs hosts for experiments
It is calculated in the same way as HostRank factor, but not on the whole owner-graph, but on its subgraph consisting of owners of the given region. Region belonging is determined by TLD, or by presence in index of pages from given owner which geo or geoa classifier says that they are from given region. Mapped in the same way as the HostRank factor, to a number from 0 to 1 with 256 gradations
Document from the language section of wikipedia corresponding to the user region
The language of the document corresponds to the language of the request
Popularity of the request within the country
Degree of centralization of the points from which the request is made (within the country)
Geographical distribution of the request within the country
The hour in which this request is most frequently asked
The severity of querying at different times of the day (within the country)
The country of the document (domain) and the country of the user are the same ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/OpisanijaFaktorov#nationaldomain details))
There's a porn ad on the page
URL feature computed from rapid clicks spy_log counters with decay of 3 days
Country localizability classifier - how much the query implies the country context
Number of slashes in the url
BM25 with different parameters for different fields, including incoming anchortext. The text weights of incoming links to the page are normalized according to the delta page rank of the link
Built-in video player on the page
Video for download
URL feature computed from rapid clicks spy_log counters with decay of 3 days
URL feature computed from rapid clicks spy_log counters with decay of 14 days
A service factor that was needed to search the site, and will still be needed in the future.
The factor is calculated from the text of the url using the quality/seq/gsk sequence classifier
Model with learning each trigram on '+' and '-' urls. It does not depend on the query.
URL feature computed from rapid clicks spy_log counters with decay of 14 days
Age of rapid clicks spy_log update, in seconds
Freshness of rapid clicks spy_log update
The size of the minimum chunk of text that includes all of the query words in the document. Not currently used. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/YMW more info))

Bclm

#493
Factor named after Buettcher, Clarke, and Lushman (modified) ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/BCLm more info))
A measure of the 'commerciality' of a query. It is a complex factor calculated by MatrixNet using a formula based on the purchasing dictionary in directe-mail, logs of user queries, and additional intent dictionaries. For queries with intent to buy, the factor tends to 1; for product queries, to 0.6; for queries with no intent to buy (reviews, etc.), to 0. ((http://wiki.yandex-team.ru/AntonNeljubin/FaktorydljaNovogoKlassifikatorazaprosov factors classifier)) ((http://wiki.yandex-team.ru/JandeksPoisk/Antispam/AntiSEO/KlassifikatorKommercheskixZaprosov more about it))
Unigram language model. The model is built from the document and smoothed by the general language model. When building the document model, information about which field of the document the query word occurred in (title, head, or plain text) is used
Matching geography defined from document url and query city (ip or lr)
Matching geography defined from document url and query area (ip or lr)
Coincidence of geography, defined from document url and country of request (ip or lr). Relevant for Russia and Ukraine.
Match the geography defined from the document url and the city in the query (GeoCity rule)
The value of the forked commerce detector calculated in behemoth.
Calculates the query coverage by the alphabetic trigrams of the document header
Calculates the header coverage by the alphabetic trigrams of the document title
Probabilistic model based on the texts of incoming links
Counts the sum of occurrences of the following: a sequence of query words longer than two occurring in one sentence; normalized to the length of the document.
Counts the sum of occurrences of the following form: a sequence of query words longer than two, occurring in one link; normalized to the number of links.
Share of clicks on navigation requests
The result has a geo-reference that does not match the user's geography at the city level ([415]==1 && [215]==0)
Geovitability of the query for results from the user's region
Geovitability of the query for results not from the user's region
The percentage of URLs that respond without errors
Matching thematic spectrum (by DMOZ) of the query and the document. The subject of the query is determined by ((http://wiki.yandex-team.ru/JandeksPoisk/ZarubezhnyjjInternet/DMOZqueryClassifier1 DMOZTheme wizard rule)) Document subject is determined by the automatic classifier
Matching thematic spectrum (by DMOZ) of the query and the document. Subject of the query is determined by the best result ((http://wiki.yandex-team.ru/JandeksPoisk/ZarubezhnyjjInternet/DMOZqueryClassifier1 DMOZTheme wizard rules)) Document subject is determined by automatic classifier

Mpsa

#513
Estimates the minimum distance between pairs of query words, taking into account the distance of the pair from the beginning of the document (Minimal Pair Size with Attenuation). By pairs we mean all consecutive bigrams of query words. Thus, the number of pairs is equal to the number of words in the query reduced by 1. Accordingly, the factor makes sense for queries consisting of more than one word.((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/MPSA MPSA))
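The pair structure described above (consecutive bigrams of query words, so n - 1 pairs for an n-word query) can be sketched as follows. This sketch only finds the minimum distance for each pair; the attenuation by distance from the beginning of the document is omitted because the listing does not give its formula. The function name and the exact-token matching are assumptions.

```python
def min_pair_distances(query_words, doc_words):
    """For each consecutive pair of query words, return the minimum
    distance (in word positions) between their occurrences in the
    document, or None if either word is absent."""
    positions = {}
    for index, word in enumerate(doc_words):
        positions.setdefault(word, []).append(index)

    distances = []
    for first, second in zip(query_words, query_words[1:]):
        first_pos = positions.get(first, [])
        second_pos = positions.get(second, [])
        if not first_pos or not second_pos:
            distances.append(None)
        else:
            distances.append(
                min(abs(a - b) for a in first_pos for b in second_pos)
            )
    return distances


doc = "the big red car is a big car".split()
print(min_pair_distances(["red", "big", "car"], doc))  # [1, 1]
```

A one-word query yields an empty list, which matches the remark that the factor only makes sense for queries of more than one word.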

Bclm2

#514
It differs from BCLm in that the weights of all words are counted equally. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/BCLm2 BCLm2))
Text relevance based on the language model, taking into account the absolute position. The text is scanned with a window of 20 words; for each window a language model is built (that is, a probability distribution over the words of the Russian language) and the probability of generating the query is calculated. The model is penalized for distance from the beginning of the document.
Page region size
Freshness of rapid clicks spy_log update, calculated at the request time

Is Geo

#520
Passes to the base searches, under the name isgeo, the maximum weight of a geo-object found in the query. A geo-object is an object of category Geo, Geo1, GeoAddr, GeoAddr1, LandMark, or LandMark1 (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares Details))
Passes to the base searches, under the name ismusic, the maximum weight of an object of category Music or Music1 found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares Details))
A modification of the Bclm2 factor, lightened for use in FastRank. The main difference is that BclmLite does not use absolute word offsets relative to the beginning of the document. Instead, the factor works with regular positions of the form <sentence_number, position_in_sentence>. The proximity between words is only taken into account within a sentence. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/BCLmLite BCLmLite))
Results in the immediate vicinity ([pharmacies], [children's polyclinic]) are important when answering the query
When answering a query, the results within the city are important (the bulk of localizable queries)
When answering a query, the results from the user's area, region ([airport], [dairy]) are important
Number of incoming links from main pages ("mordas")
Corrected YmwFull. Only differs from previous version by behavior on 2-word queries. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/YMW more info))
Binary factor, every word of the query is in the text or in the links
uses 'country aux tree' (auxqc)
uses 'country aux tree' (auxqc)
Page - '404' (share of '404' tokens in relation to the total number of tokens on the page)
URL feature computed at the request time from rapid clicks spy_log counters with decay of 1 day
BM25 in which the word weights are machine-learned
Factor evaluates how query words are grouped with each other in the text of the document without regard to their order. ((http://wiki.yandex-team.ru/SergejjKrylov/QueryWordCohesionTR description))
nd/k normalized time to click
URL feature computed at the request time from rapid clicks spy_log counters with decay of 0.5 days
selected formula
Number of letters in the Aux segment
Number of gaps in the Aux segment
Number of commas in the Content segment
The page is the store. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/OpisanijaFaktorov#isshop description)). Not used (deprecated)
The logarithm of the number of shingles in the document added by the site host as original texts in ((http://wiki.yandex-team.ru/JandeksPoisk/Jekosistema/MarketingPR/Webmasters/plan/vtorcontect Originality plugin)). Not included in the formula; needed for re-ranking of duplicates
Average filtered number of sources of document authorship. Not included in the formula, needed for re-ranking of duplicates
((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/OpisanijaFaktorov#queryreftrigrams description))
((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/OpisanijaFaktorov#queryreftrigrams description))
IDF variance of query words if there are text hits in the document (mixed query-text factor)
UrlNGramsModel ranking factor in erf
The language of the document corresponds to the country of the request

Locm

#558
Word order in references.
The degree of diversity of queries clicked by this url is counted by region
Proportion of query segments present in the text
The language of the document is one of the allowed for Turkey (Turkish, English, German, French, Arabic, Azeri) or the document has zero length. At the search stage it is calculated only for IsRealGeoLocal queries.
A variation on the theme ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/DBM25 DBM25)), see ysite/yandex/relevance/dbm25.cpp
Dispersion of document reference regions
Number of clicks on the owner and the number of clicks on the request more than 5
BM25FdPR with normalization to the average document length depending on the document language. ((http://wiki.yandex-team.ru/BM25FRework Test Results.))
The popularity of the document language. A number from 0 to 1. ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/LanguagePopularity LanguagePopularity))
The sum of factors QueryDOwnerClicksFRC and BM25FdPRFixed with weights 0.358449 and 0.184922 respectively. The '565' in the factor name should not be taken literally, it is either a legacy or a typo.
The sum of factors 192 and 341 with weights of 0.298942 and 0.454625, respectively.
URL feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
URL feature computed at the request time from rapid clicks spy_log counters with decay of 14 days

Tocm

#572
Factor evaluates the difference between the positions of words in the header and the positions of words in the query
Dispersion of languages in xmap
There is a typo in the query
A variation on the theme ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/DBM25 DBM25)), see ysite/yandex/relevance/dbm25.cpp
The url is known to show up too often with very low relevance (by bert and/or by bm25)
The ratio of the number of incoming links whose text is a URL to the number of all incoming links
A pool of PRS logs is tagged using Bert trained on sinsig. The dssm model is trained on this pool, using BaseRegionChain
The number of non-letter characters in the URL
URL length to within a character. Disabled in production.

Is Hub

#582
Hubness of the page
Degree of commerciality of the page header. Not used (deprecated)
BM25 of the page title by its text
BM25 page title by the text of the links to it
Number of incoming seo-trash links between hosts
Static URL factor by search sessions for 1600 days calculated by mobile sessions. Average DwellTime, and DwellTime from session is truncated if more than 180 seconds
Static URL factor by search sessions for 1600 days calculated by mobile sessions. Probability that the click on the URL will be more than 120 seconds
Static URL factor by search sessions for 1600 days calculated by mobile sessions. The probability that the URL will not be clicked given that at least one URL below it is clicked.
Static URL factor by search sessions for 1600 days calculated by mobile sessions. Average DwellTime, and DwellTime from session is truncated if more than 3600 seconds. Localization to country level.
Static URL factor by search sessions for 1600 days calculated by mobile sessions. Average DwellTime, and DwellTime from session is truncated if more than 180 seconds. Localization to country level.
The value of the health detector calculated in behemoth.
OffersBase feature for ecoboost.
OffersBase feature for ecoboost.
OffersBase feature for ecoboost.
OffersBase feature for ecoboost.
Share of unique title trigrams in the link trigrams
Share of unique link trigrams in the title trigrams
The publicity of the page
Similar to YabarUrlVisits
The document URL corresponds to the user's region(s) ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/geo/RegNavQueries /JandeksPoisk/KachestvoPoiska/geo/RegNavQueries))
The URL of the document corresponds to the user's city
Regional navigational query - there are one or more navigational results in the user's region
Number of sessions in which the url was the last, divided by the number of sessions in which the url appeared
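The session ratio described above is straightforward to compute from session logs. A minimal sketch, assuming each session is represented as an ordered list of visited URLs; that representation and the function name are hypothetical.

```python
def last_in_session_share(sessions, url):
    """Share of sessions that ended on `url` among sessions containing it.

    `sessions` is an iterable of ordered URL lists (an assumed format).
    Returns 0.0 when the URL never appears.
    """
    appeared = 0
    was_last = 0
    for session in sessions:
        if url in session:
            appeared += 1
            if session[-1] == url:
                was_last += 1
    return was_last / appeared if appeared else 0.0


sessions = [["a", "b"], ["b", "a"], ["c"], ["a"]]
print(last_in_session_share(sessions, "a"))  # 2 of 3 sessions end on "a"
```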
The sum of the maximum SourceRank values for each incoming link, taking into account the uniqueness of the owner.
BM25 by texts and links with special scales by level of matching (form, lemma, synonym)
Weight of query words that are in the text in exact form
Weight of query words that are in the text with exact lemma
Weight of query words that are in the text

Is Hum

#610
Passes to the base searches, under the name ishum, the maximum weight of an object of category Hum or Hum1 found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares#ishum Details))
Passes to the base searches, under the name istext, the maximum weight of an object of category Text or Text1 found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares#istext Details))
Passes to the base searches, under the name ispicture, the maximum weight of an object of category Picture or Picture1 found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares#ispicture Details))
Returns, under the name wmaxone, the maximum degree of naming among the objects found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares#maxone More))
Returns, under the name wminone, the minimum degree of naming among the objects found in the query (see ((http://wiki.yandex-team.ru/AlekseySokirko/QueryObjects som's markup))). ((http://wiki.yandex-team.ru/ArsenGadzhikurbanov/Wares#minone More))
Bm25 by query index for domAttr
Bm25 by query index for domAttr
Bm25 by query index for domAttr
BCLM by query index for domAttr
BCLM by query index for owners
Allows you to assess whether a document is 'live' in terms of references to it coming in.
Maximum sum of query word weights in a window of 50 words
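One way to read the 50-word window factor above is as a sliding-window maximum. The sketch below assumes that each distinct query word counts once per window and that matching is on exact tokens; both are assumptions, as is the function name.

```python
def max_window_weight(doc_words, query_weights, window=50):
    """Maximum total weight of distinct query words found in any
    window of `window` consecutive document words.

    query_weights maps query word -> weight.
    """
    best = 0.0
    last_start = max(1, len(doc_words) - window + 1)
    for start in range(last_start):
        present = query_weights.keys() & set(doc_words[start:start + window])
        best = max(best, sum(query_weights[w] for w in present))
    return best


doc = ["x"] * 60
doc[0], doc[55] = "a", "b"
print(max_window_weight(doc, {"a": 1.0, "b": 2.0}))  # 2.0
```

In the example the two query words are 55 positions apart, so no 50-word window contains both and the best window holds only the heavier word.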
Similar to YabarUrlVisitors
Similar to YabarUrlAvgTime
The core audience of pages that have a Metrica counter
Share of clicks on this url among all clicks on similar requests
corrected CTR of this url for all similar queries
Clickability of the domain by bigrams (without taking into account thesaurus query extensions)
Visits to the site from search engines by bigrams, according to Bar (without taking into account thesaurus query extensions)
Clickability of the host for the last word of the query (without taking into account thesaurus query extensions)
OffersBase feature for ecoboost.
OffersBase feature for ecoboost.
Business kernel.
Business kernel.
Business kernel.
URL feature computed at the request time from rapid clicks search counters with decay of 1 day
A copy of the ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Locm LOCM)) factor for ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Synset synsets)).
Copy of LinkBM25 factor for ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/Synset synsets)).
URL feature computed at the request time from rapid clicks search counters with decay of 30 days
The most probable topic of the query as determined by the ((http://wiki.yandex-team.ru/JandeksPoisk/ZarubezhnyjjInternet/DMOZqueryClassifier1 DMOZTheme wizard rule)); only the most popular topics are taken into account (but more of them than in the DmozQueryThemes factor). The factor contains the probability that the query matches the theme, with each theme assigned its own sub-interval of [0...1].
Query topic as determined by the ((http://wiki.yandex-team.ru/JandeksPoisk/ZarubezhnyjjInternet/DMOZqueryClassifier1 DMOZTheme wizard rule)); only a few of the most popular topics are taken into account.
0 or 1 depending on the presence of an explicit need_photo intent in the request from the variety
0 or 1 depending on whether the query has an explicit need_map intent of the variety
Factor is analogous to LongQuery (sum of query word idf), but with 'correct' synonyms. Specifically, the minimum of idf (i.e. the most frequent) of synonyms and words is selected.
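The synonym-aware sum of IDF values described above can be sketched as follows. The dictionaries for IDF values and synonyms are hypothetical inputs, and the function name is illustrative only.

```python
def long_query_with_synonyms(query_words, idf, synonyms):
    """Sum over query words of the minimum IDF among the word and its
    synonyms (the minimum IDF corresponds to the most frequent variant).

    idf      - maps word -> IDF value
    synonyms - maps word -> list of synonyms (may be missing for a word)
    """
    total = 0.0
    for word in query_words:
        candidates = [word] + list(synonyms.get(word, []))
        total += min(idf[c] for c in candidates)
    return total


idf = {"car": 2.0, "auto": 1.5, "red": 3.0}
print(long_query_with_synonyms(["red", "car"], idf, {"car": ["auto"]}))  # 4.5
```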
The url contains a token that matches the short name of the user's country. The factor counts only on the EU thread.
Personalized Turkish PageRank
Expected number of searches on the query
Share of unique trigrams of the footer fragment in the link trigrams
Share of unique link trigrams among the fragment of footer trigrams
Binary logarithm of the query probability by the erratum service language model
URL is an offer in the latest version of the marketplace base.
A variation on the theme ((http://wiki.yandex-team.ru/JandeksPoisk/KachestvoPoiska/ObshayaFormula/TekushhieKomponenty/DBM25 DBM25)), see ysite/yandex/relevance/dbm25.cpp
BM25 variation
BM25 variation
BM25 variation
'Fixed' clicks counted with RequestAggregateLib
'Fixed' clicks counted with RequestAggregateLib. Regional Version
Regional Attendance of the url according to me-bar data
Average time of user's stay on the host in case of external (from another nonsearch site) access from a particular URL
Average 'depth' (number of hits within a host) of user's stay on the host during external (from another nonsearching site) visits from a particular URL
DBM separately by number
DBM separately by geo-request objects
DBM separately for nouns
Average length of the logical session in which there was a request
Bclm (weighted) by lyrics from hops.
bounce rate

Bocm

#668
Assesses whether the positions of words in the document sentences match the positions of words in the query.
The churn rate of users from search after a visit to the site
The document contains the name from the request.
This is index.(html/php/aspx?/...), without cgi parameters. Computed over all duplicates.
This is index.(html/php/aspx?/...), possibly with cgi parameters. Computed over all duplicates.
Whether the host is its own owner, conditionally Host == Owner(Host).
Minimum PathAndQuery length over all near-duplicates.
Regionalized version of XLerfGeoLRlogRelev factor (only links from the country of the request are taken)
Regionalized (only links from request country are taken) variant of XNonCommLerfNormLRlogRelev factor
Regionalized (only links from the country of the request are taken) variant of Locm factor
Regionalized (only links from the country of request are taken) variant of XLRrelev factor
Regionalized (only links from the country of request are taken) variant of XLerfLRrelev200 factor
((http://wiki.yandex-team.ru/JandeksPoisk/Antispam/polunavigacionnyezaprosy#faktornavigacionnostiparyurl-zapros Classifier)) of vital [query, url] pairs; the URL is considered vital for the query if the value exceeds 0.5
Classifier for commercial site evaluations
There is a direct link to the file on the document
The document has a link to filehosting
0 or 1: whether the query matches the regular expressions from the ticket
0 or 1: whether the query matches the regular expressions from the ticket
0 or 1: whether the query matches the regular expressions from the ticket

Qr Tur

#687
Predicting the proportion of "good" (at least with two different cities and frequency>=10) mentions of the query with geography in Turkey
The result of the lexical query classifier that predicts the probability of a click on the 3561 subject page
The result of the lexical query classifier that predicts the probability of a click on the 3973 subject page
Query 'navigability' rank
Regional traffic from search engines for a particular query
Clicks on the urls shown in the output for queries that have gone to other search engines
Showing urls in the output for queries that have gone to other search engines
Classification of commerciality of the site
In the latest version of the base of the marketplace there are offers from this host.
The proximity of the query words to the hardest word.
The url satisfies the regexp-expression defined in the prone
The document contains user feedback/comments
Share of clicks on a given url among all clicks on similar queries, country version, see ((http://wiki.yandex-team.ru/Development/Poisk/arcadia/indexregex indexregex))
corrected CTR of this url for all similar queries, country version, see ((http://wiki.yandex-team.ru/Development/Poisk/arcadia/indexregex indexregex))

Found

#701
Average amount found by query
Angle in the Depth Nodes space, counted by words only (Min for all)
Classifier that approximates the quality of commercial sites based on user behavior data
Document creation time with month accuracy 1.0 -- current month, 0 -- 10 years ago and older. Temporarily disabled
Document update time with month accuracy 1.0 -- current month, 0 -- 10 years ago and older. Temporarily disabled
The year distribution likelihood function in the document. Temporarily disabled
Num of Sovetnik urls
The variance of the number of query words in the links.
The arithmetic average of date positions in the document. Temporarily disabled

Cabm

#714
BM with fading in the text of the catalog references.
Average url position by normalized query
Average domAttr position by normalized query
Average url position for all queries
Average host position for all queries
Number of requests per url
Number of requests per host
Implementation of the algorithm described in the article ((http://wiki.yandex-team.ru//h.yandex.net/?http%3A%2F%2Fresearch.microsoft.com%2Fen-us%2Fpeople%2Ftyliu%2Ffp032-liu.pdf http://research.microsoft.com/en-us/people/tyliu/fp032-liu.pdf)), by major regions (TRUBK)
Proportion of document words from segments with score > 2.
Site quality rank used for Moscow commercial formula boosts
Factor is used in SelectionRank. TG_UNUSED: should not be included in formulas to avoid feedback
URL feature computed at the request time from rapid clicks search counters with decay of 3 days
Weight of the document according to the one-word dictionary of commercial vocabulary
Shows that the request is in Ukrainian
Average query commerciality
Number of queries in the group of frequency queries similar to the specified one
FRC group of frequency queries similar to the specified one, with averaging through the sum of clicks and impressions
FRC group of frequency queries similar to the specified one, with averaging through the sum of clicks and impressions, according to regional statistics
URL feature computed from rapid clicks search frozen counters with decay of 1 day
The relative popularity of the word-host pair, where word is the word in the title of the Wikipedia article and host is the host referenced in the article.
Relative clickability of countryId-word-host triplets according to Yandex searches.
Relative clickability of countryId-word-host triplets according to data from popular search engines according to Bara and SimilarGroup logs.
Share of clicks on this url among all clicks on similar queries, calculated by popular search engine
Petal length Depth Nodes, calculated for hosts
Angle variance in Nodes Time space, calculated for hosts
0.9-quantile of the lobe length in Nodes Time space, calculated for hosts
The probability of downloading a file from the host after a click, averaged over the words of the query.
The nastiness factor of content.
CTR by click data, request normalized by synset
Regional CTR by click data, query normalized by synset
Static trigrams intersection of url and queries by which users visited the url.
The result of the wizard's rule.
Weighted BM15 for a query by index document - a list of queries to which it has been navigated.
Probability of downloading from the host after the click (according to Bar logs).
Number of chains by request / (number of chains in which the url participated + number of chains by request).
The number of chains in which the url was last, normalized to the total number of chains in which the url was.
Number of hits to the Wikipedia url
URL feature computed from rapid clicks search frozen counters with decay of 30 days
Indicator of the page as a hub (how many pages Bar users go to from it).
It counts TextBM25 in the title by the text of the user's region name - similar to factor 268.
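
A minimal sketch of the two chain factors listed above ("Number of chains by request / (number of chains in which the url participated + number of chains by request)" and "The number of chains in which the url was last, normalized to the total number of chains in which the url was"). The first formula is taken literally from the description. The data representation (a list of chains, each an ordered list of urls), the use of one chain list for both factors, and the function name are assumptions made for illustration.

```python
def chain_factors(url, query_chains):
    """Return (participation_ratio, last_in_chain_share) for a url.

    query_chains: list of chains observed for one query; each chain is an
    ordered list of urls (hypothetical representation, not from the source).
    """
    total = len(query_chains)
    chains_with_url = [chain for chain in query_chains if url in chain]
    chains_ending_on_url = sum(1 for chain in chains_with_url if chain[-1] == url)

    # Literal reading: chains by request / (chains with the url + chains by request).
    participation_ratio = total / (len(chains_with_url) + total) if total else 0.0

    # Chains where the url was last, normalized by chains where it appeared at all.
    last_share = chains_ending_on_url / len(chains_with_url) if chains_with_url else 0.0
    return participation_ratio, last_share


if __name__ == "__main__":
    chains = [["a", "b"], ["b"], ["c", "a"]]
    print(chain_factors("a", chains))
```

Read literally, the first ratio moves from 1.0 (the url never appears in a chain) toward 0.5 (the url appears in every chain), so smaller values indicate heavier participation.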

Bclmf

#771
BCLM for Annotation index, doc text and links.
Dssm probability prediction by url + title that there are no products on the page.
FRC of a popular search engine by browser logs
Weighted mean of log(query_clicks)/log(query_shows) for given host. Weights are proportional to log(query_shows) + 0.2.
The number of hits on the url occurring in the chain of hops, normalized to the total number of hits on the request.
The probability of the url being the last on the request in the chain of hops.
Dssm probability prediction by url + title that there is one product on the page.
Dssm probability prediction by url + title that there are many products on the page.
URL feature computed from rapid clicks search frozen counters with decay of 3 days
The geo-referencing of the city level is defined for the url according to BUKI-1125 rules
Country level geo-referencing is defined for the url according to BUKI-1125 rules
Factor GeoRelevRegionCity by geoa attribute
Factor GeoRelevRegionRegion by attribute geoa
GeoGeometryProxim factor by geoa attribute
GeoRelevAlienCity factor by geoa attribute
GeoVQueryInUserCity factor by geoa attribute
GeoVQueryInAlienCity factor by geoa attribute
Factor PageRegionSize by attribute geo
PageRegionCoverage factor by geo attribute
The PageRegionCoverage factor by the adresa attribute
Factor GeoRelevRegionCity by attribute adresa
The fraction (on average per session) of clicked urls for this query that this url accounts for. Calculated from user sessions.
Owner is a store
Owner is a service
Bclm (plane) by the texts from the hops.
FRC on transitions from queries that were set by the user several times
Average weight of impressions on the first page; a click weighs 1, a non-click weighs 1 according to the SBM_GAMMAS table
Average weight of impressions on the first page; a click weighs 1, a non-click weighs 1 according to the SBM_GAMMAS table. Regional version
The half-sum of the estimated url position and the median position over all similar queries by bist
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
Host feature computed at the request time from rapid clicks spy_log counters with decay of 3 days
Host feature computed at the request time from rapid clicks spy_log counters with decay of 14 days
Host feature computed from rapid clicks spy_log counters with decay of 3 days
Host feature computed from rapid clicks spy_log counters with decay of 3 days
Host feature computed from rapid clicks spy_log counters with decay of 14 days
Host feature computed from rapid clicks spy_log counters with decay of 14 days
Host feature computed from rapid clicks spy_log counters with decay of 3 days
Host feature computed from rapid clicks spy_log counters with decay of 14 days
Finetuned reformulations DSSM to commercial clicked bargain odd-like target from visit log
Is video distributor legal
Average value of feature OneProductProbability
Average value of feature ManyProductsProbability
Average value of feature PayDetectorPredict
Owner is a partner
The document is ShopInShop
The value of the conversion rate of the query calculated in behemoth.
Factor by name from the original query. Computed from the contents of the document. Algorithm: Chain0Wcm.
At least one of the offers from the distributed scheme has a status of availability.
There is not a single offerer in the unraveled scheme.
For the url from ytier it is known that it has low quality content
For the url from ytier we know that its content is of acceptable quality
For the url from ytier it is known that it has good quality content
For the url from ytier it is known that it has excellent quality content
On the host there is a purchase on the EUOM.
There is a VISIT LOG purchase on the host.
The URL is a product on the Marketplace.
The URL is a product on the Marketplace and has an offerid.
The URL is ShopInShopCPA.
At least one of the offers from the distributed scheme has the status of unavailability.
There is a purchase on the EUOM.
There is a VISIT LOG purchase on the owner.
Dssm probability prediction by url + title that the document is a sponger.
The PartnerOfferContent available field in the new parser.
In the offerer from the new parser the field PartnerOfferContent available == true.
Normalized corrected clicks count by query with user's city(gc=) mentioned
Normalized corrected clicks maximum ratio by query with user's city(gc=) mentioned
Normalized corrected clicks maximum ratio by query with not user's city(gc=) mentioned
The value of PurchaseTotalPredict calculated in the behemoth.
The value of SerpSummarySurplusPredict calculated in behemoth.
User retrievability at the url
The value of RequestWith120D3ClickPartPredict calculated in behemoth.
Value of the query detector of the spongers calculated in the behemoth.
Logarithm of the average time a user was on a host with localization by country; calculated from Yabar logs
Ratio of dwell time on a host in a given region to dwell time on a host in all regions
Ratio of dwell time on the page in the given region to the dwell time on the page for all regions
The more users bookmark a url, the higher the factor value
Predicting sos.dssm model by url + title.
Predicting med.dssm model by url + title.
Predicting fin_law.dssm model by url + title.
This url has a link from Infoboxes on Wikipedia.
Predicting cruelty.dssm model by url + title.
The value of HalfEcomPredict, calculated in behemoth.
A factor similar to RegexMaxClickPercentReg, but calculated by prefix-suffix generalization.
A factor similar to RegexMaxClickPercentYabarReg, but calculated by prefix-suffix generalization.
A request-document model of navigability.
Average slope angle in the vertex-hanging plane
QueryUrl factor. Value - result of collaborative data filtering for the QueryUrlCorrectedCtr factor
The value of the MatrixNet slow ranking model.
The value of the MatrixNet fast ranking model.
The value of the MatrixNet filter model.
Factor in the text of the query and the title of the document, assessing the correspondence of the numeric ranges at the marker words
The Polynom value of the slow ranking model.
The value of Polynom fast ranking model.
The value of the Polynom filter ranking model.
An indication that the document was received by machine translation
Predicting med_with_trash.dssm (med. doc. model with lerne trash infusion) model by url + title.
Predicting fin_law_with_trash.dssm (fin_law_with_trash.dssm) model by url + title.
Factor by name from the original query. Computed from the content of the document. Minimum window size that includes all the words of the query. Normalized by the number of words in the query.
Factor by name from the original query. Document text. CosineMatchMaxPrediction algorithm.
Factor by all names from the original query. Aggregation by all extensions. Aggregation type by extension: largest factor value. Computed by document content. Algorithm: Chain0Wcm.
Factor by all names from the original query. Aggregation by all extensions. Type of aggregation by extensions: the largest factor value. Computed by document content. Minimum window size that includes all query words. Normalized by the number of words in the query.
Share of the url in the total number of clicks per session on the request (synnorm).
The average share of clicks on this url for this query among all clicks on this query (synnorm) during the day.
The average share of clicks on this url for this query among all clicks on this query (qnorm) during the day.
QI version of factor 861. MaxValue over the set of popular similar queries.
QI version of factor 798. MaxValue over the set of popular similar queries.
Factor by all names from the original query. Aggregation by all extensions. Type of aggregation by extensions: largest factor value. Document text. CosineMatchMaxPrediction algorithm.
Dssm, predicting page quality score for a document
Query-url factor. The value is the result of collaborative data filtering for the SamplePeriodDayFrc factor
The value of the MatrixNet fast filter model.
The value of Polynom fast filter ranking model.
QI version of factor 879.
The value of MatrixNet on the meta.
The value of Polynom on the meta.
The document is a short video (TikTok, Reels, Shorts).
The document is a Telegram channel in web format.
The document is a post in Telegram.
CorrectedCtrReg factor in the annotation index, AnnotationMatchPrediction factor
CorrectedCtrReg factor in the annotation index, QueryMatchPrediction factor
CorrectedCtrReg factor in the annotation index, ValueWcmAvg factor
CorrectedCtrReg factor in the annotation index, factor Bm15V4K5
Factor about presence of '?' symbol in url. Equals zero if url has cgi parameters (more precisely: all duplicates have '?' symbol in url).
DSSM click prediction from Alice-specific data
Factor by phone attributes tel_full from the original query. Text document. Bocm15 word weight aggregation algorithm. The normalization coefficient is 0.01.
SamplePeriodDayFrc factor in the annotation index, QueryMatchPrediction factor
SamplePeriodDayFrc factor in the annotation index, AnnotationMatchPrediction factor
OneClick factor in the annotation index, QueryMatchPrediction factor
OneClick factor in the annotation index, AnnotationMatchPrediction factor
OneClick factor in the annotation index, factor Bm15AK4
OneClick factor in the annotation index, factor BocmWeightedW1K3
LongClick factor in the annotation index, QueryMatchPrediction factor
LongClick factor in the annotation index, AnnotationMatchPrediction factor
LongClick factor in the annotation index, factor Bm15AK4
LongClick factor in the annotation index, factor BocmWeightedW1K3
SplitDwellTime factor in the annotation index, QueryMatchPrediction factor
SplitDwellTime factor in the annotation index, AnnotationMatchPrediction factor
BQPR factor in the annotation index, QueryMatchPrediction factor
BQPR factor in the annotation index, AnnotationMatchPrediction factor
YabarVisits factor in the annotation index, QueryMatchPrediction factor
YabarVisits factor in the annotation index, AnnotationMatchPrediction factor
YabarTime factor in the annotation index, QueryMatchPrediction factor
YabarTime factor in the annotation index, AnnotationMatchPrediction factor
SimpleClick factor in the annotation index, QueryMatchPrediction factor
SimpleClick factor in the annotation index, AnnotationMatchPrediction factor
LongClick factor in the annotation index, BocmPlain factor
Collaborative filtering result for factor FI_DBM35 from random log in annotation index, FullMatchPrediction factor
Collaborative filtering result for factor FI_DBM35 from random log in the annotation index, factor AnnotationMatchPrediction
OneClick factor in the annotation index, SynonymMatchPrediction factor
OneClick factor in the annotation index, FullMatchPrediction factor
OneClick factor in the annotation index, ValueWcmAvg factor
OneClick factor in the annotation index, BocmWeightedMaxK1 factor
OneClick factor in the annotation index, factor Bm15StrictK2
OneClick factor in the annotation index, factor Bm15MaxK3
OneClick factor in the annotation index, factor BclmPlainW1K3
OneClick factor in the annotation index, ValueWcmMax factor
OneClick factor in the annotation index, ValueWcmPrediction factor
OneClick factor in the annotation index, BclmWeightedK3 factor
BQPR factor in the annotation index, factor BocmWeightedW1K3
BQPR factor in the annotation index, factor Bm15StrictK2
SplitDwellTime factor in the annotation index, factor BocmWeightedMaxK1
SplitDwellTime factor in the annotation index, FullMatchPrediction factor
SplitDwellTime factor in the annotation index, ValueWcmAvg factor
CorrectedCtrReg factor in the annotation index, factor Bm15StrictK2
Predicting the proportion of queries with geography by the bag of words built for a query
The query is a url, up to dot and space characters; the wizard's isurl rule is used
Collaborative filtering result for factor FI_DBM35 from random log in annotation index, factor ValueWcmMax
Collaborative filtering result for factor FI_DBM35 from random log in the annotation index, factor ValueWcmAvg
Collaborative filtering result for factor FI_DBM35 from random log in annotation index, factor Bm15StrictK2
Collaborative filtering result for factor FI_DBM35 from random log in the annotation index, factor BclmPlainW1K3
Collaborative filtering result for factor FI_DBM35 from random log in annotation index, factor BclmWeightedK3
Collaborative filtering result for factor FI_DBM35 from random log in annotation index, factor BocmWeightedW1K3
CorrectedCtrXfactor in the annotation index, AnnotationMatchPrediction factor
CorrectedCtrXfactor in the annotation index, QueryMatchPrediction factor
CorrectedCtrXfactor in the annotation index, ValueWcmMax factor
CorrectedCtrXfactor in the annotation index, ValueWcmAvg factor
CorrectedCtrXfactor in the annotation index, BocmWeightedW1K3 factor
CorrectedCtrXfactor in the annotation index, factor BclmPlainK3
CorrectedCtrXfactor in the annotation index, factor BclmMixPlainW1K1
Predicting the total timestamp to the end of the session if this request-document pair is implemented
Predicting the contribution of this query-document pair to the timespan
SamplePeriodDayFrc factor in the annotation index, ValueWcmAvg factor
SamplePeriodDayFrc factor in the annotation index, factor Bm15MaxK3
SamplePeriodDayFrc factor in the annotation index, factor BocmWeightedK3
SamplePeriodDayFrc factor in the annotation index, factor BocmDoubleK5
SplitDwellTime factor in the annotation index, factor Bm15MaxK3
SimpleClick factor in the annotation index, factor BclmWeightedK3
Predicting the percentage of track length that will be played if this request-track pair is implemented
The probability that the region predicted by the yweb/robot/urlgeo_ml model is correct, assuming the predicted city
PopularSEFRCBrowser factor in the annotation index, AnnotationMatchPrediction factor
PopularSEFRCBrowser factor in the annotation index, SynonymMatchPrediction factor
PopularSEFRCBrowser factor in annotation index, ValueWcmPrediction factor
PopularSEFRCBrowser factor in the annotation index, factor BclmWeightedV2K3
PopularSEFRCBrowser factor in the annotation index, factor BclmMixPlainW1K1
Calculated by the link index. Max(sum(idf)) over all links that are subsets of query / sum(idf) for query
OneClick factor in the annotation index, AnnotationMatchPredictionWeighted factor
LongClick factor in the annotation index, AnnotationMatchPredictionWeighted factor
YabarTime factor in the annotation index, AnnotationMatchPredictionWeighted factor
Equals one if the page connects the js-api of any geo-data provider
LongClickSamplePeriod factor in the annotation index, AnnotationMatchPrediction factor
LongClickSamplePeriod factor in the annotation index, QueryMatchPrediction factor
LongClickSamplePeriod factor in the annotation index, ValueWcmAvg factor
LongClickSamplePeriod factor in the annotation index, ValueWcmPrediction factor
LongClickSamplePeriod factor in the annotation index, factor BclmPlainW1K3
LongClickSamplePeriod factor in the annotation index, factor BclmWeightedK3
LongClickSamplePeriod factor in the annotation index, factor BocmWeightedW1K3
LongClickSamplePeriod factor in the annotation index, factor BclmPlainK5
LongClickSamplePeriod factor in the annotation index, factor BclmWeightedV2K3
LongClickSamplePeriod factor in the annotation index, factor BocmDoubleK5
LongClickSamplePeriod factor in the annotation index, factor Bm15StrictK2
Normalized corrected clicks maximum ratio by query with user's city(gc=) mentioned equal by region
Normalized corrected clicks maximum ratio by query with user's city(gc=) mentioned equal to user's region
BQPR on the sampled period. Annotation Index. Factor WcmCoverageMax
BQPR on the sampled period. Annotation Index. FullMatchPrediction Factor
BQPR on the sampled period. Annotation Index. Factor AnnotationMatchPredictionWeighted
BQPR on the sampled period. Annotation Index. Factor ValuePcmAvg
BQPR on the sampled period. Annotation Index. Factor ValueWcmAvg
BQPR on the sampled period. Annotation Index. Factor Bm15V4K8
BQPR on the sampled period. Annotation Index. Factor BocmWeightedV4K8
BQPR on the sampled period. Annotation Index. SampleWcmMax factor
BQPR on the sampled period. Annotation Index. SynonymMatchPrediction factor
BQPR on the sampled period. Annotation Index. AnnotationMatchPrediction factor
BQPR on the sampled period. Annotation Index. Factor SuffixMatchCount
BQPR on the sampled period. Annotation Index. Factor WcmCoveragePrediction
DoubleFrc in the annotation index, FullMatchPrediction factor
DoubleFrc in the annotation index, SynonymMatchPrediction factor
DoubleFrc in the annotation index, AnnotationMatchPrediction factor
DoubleFrc in the annotation index, AnnotationMatchPredictionWeighted factor
DoubleFrc in the annotation index, QueryMatchPrediction factor
DoubleFrc in the annotation index, ValueWcmAvg factor
DoubleFrc in the annotation index, factor BocmWeightedMaxK1
DoubleFrc in the annotation index, factor Bm15V4K5
DoubleFrc in the annotation index, factor BocmWeightedV4K5
DoubleFrc in the annotation index, factor BocmDoubleK1
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: minimum extension weight.
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: Bm15 by stream group 2. Maximum value of the factor by extensions.
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 by stream group 3. Maximum value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: Bm15FLogW0 by url and title. Maximal value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: CosineMaxMatchPrediction by text and title. Maximal value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: Bm15 by url. Maximal value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: FullMatchValue by stream LongClickSP. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: FullMatchValue by OneClick stream. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: Bm15FLog by stream group 1. Weighted average of the factor values multiplied by the weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}) by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: Bm15FLogW0 by url and title. Weighted average of factor values multiplied by weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}) by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MinWindowSize by text. Weighted average of the factor values by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: bag OriginalRequestFractionExact by stream group for bag factors (text, title, annotation streams).
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: bag CosineMaxMatchPrediction by stream LongClickSP.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: bag CosineMatchWeightedValue by stream LongClickSP.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: bag AnnotationMatchAvgValue by stream SimpleClick.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: bag CosineMaxMatch by Title.
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 by stream group 3. Minimum weighted value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: AnnotationMatchWeightedValue by stream LongClickSP. Minimum weighted value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: AnnotationMatchWeightedValue by stream LongClickSP. Minimum weighted value of the factor on the extension top normalized to the maximum weight on the extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: Chain0WCM by text. Weighted average of the factor values multiplied by the weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}) of the extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: FullMatchValue by stream LongClickSP. Weighted average of the factor values multiplied by the weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}) of the extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: FullMatchValue by Stream OneClick. Weighted average of the factor values multiplied by the weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}) of the extension top.
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: BclmWeightedFLogW0 by stream group 3. Weighted average of the factor values by the extension top.
OneClickFrc counted by the sampled period and collaboratively extended, FullMatchPrediction factor
OneClickFrc counted by the sampled period and collaboratively extended, AnnotationMatchPredictionWeighted factor
OneClickFrc calculated from the sampled period and collaboratively extended, ValueWcmAvg factor
OneClickFrc calculated from the sampled period and collaboratively extended, WcmMax factor
OneClickFrc calculated from the sampled period and collaboratively extended, WcmCoveragePrediction factor
OneClickFrc calculated from the sampled period and collaboratively extended, WcmCoverageMax factor
OneClickFrc calculated from the sampled period and collaboratively extended, PcmMax factor
OneClickFrc counted by the sampled period and collaboratively extended, PrefixMatchCount factor
OneClickFrc counted by the sampled period and collaboratively extended, SuffixMatchCount factor
OneClickFrc calculated from the sampled period and collaboratively extended, factor Bm15V0W1K1
The value of the locality classifier for the query
relev_locale == ru
relev_locale == ua
relev_locale == by
relev_locale == kz
relev_locale == tr
relev_locale == world
Porn query classification result from Wizard (iad_vw flag, based on Vowpal Wabbit)
Covering URLs with trigrams from query. Analogue of UrlDomainFraction,UrlPathAndParamsFraction factors.
QueryDwellTime, FullMatchPrediction factor
QueryDwellTime, SynonymMatchPrediction factor
QueryDwellTime, factor AnnotationMatchPrediction
QueryDwellTime, factor AnnotationMatchPredictionWeighted
QueryDwellTime, QueryMatchPrediction factor
QueryDwellTime, ValueWcmAvg factor
QueryDwellTime, factor BclmPlainW1K3
QueryDwellTime, factor Bm15CoverageV4K3
QueryDwellTime, factor BclmPlainK4
QueryDwellTime, factor BocmWeightedV4K5
Percentage of visits for which the dwell time during the day on the host is more than 90 sec.
Percentage of visits for which the dwell time during the day on the host is more than 160 sec.
Rank of hacked sites
Morning ags4
Maximum QsRank on the owner
Average QsRank on the main domain
Percentage of users returning within a month
Number of users who returned during the month
Doorway Rank
Share of capital letters in Title
Share of incoming traffic from search engines among all incoming traffic
Share of direct visits among all incoming traffic
Average QsRank in the sliding window
Minimum QsRank
Medium Hops
Bm15K01 factor over hits from Url
Bm15K01 factor over hits from Title
Bocm15K001 factor over hits from Title
Bm11Norm16384 factor over hits from Text
Bocm11Norm256 factor over hits from Text
CosineMatchMaxPrediction factor over hits from Text
Bm15FLogK0001 factor over hits from FieldSet1 stream
Bm15FLogK0001 factor over hits from FieldSet2 stream
BclmWeightedFLogW0K0001 factor over hits from FieldSet3 stream
Bm15FLogW0K00001 factor over hits from FieldSetUT stream
Chain0Wcm factor over hits from Body
PairMinProximity factor over hits from Body
MinWindowSize factor over hits from Body
CosineMatchMaxPrediction factor over hits from PopularSeFrcBrowser stream
MixMatchWeightedValue factor over hits from DoubleFrc stream
AnnotationMaxValueWeighted factor over hits from DoubleFrc stream
AnnotationMaxValue factor over hits from DoubleFrc stream
AnnotationMatchWeightedValue factor over hits from DoubleFrc stream
AllWcmWeightedValue factor over hits from DoubleFrc stream
AllWcmMatch95AvgValue factor over hits from DoubleFrc stream
AllWcmWeightedPrediction factor over hits from DoubleFrc stream
AllWcmMatch80AvgValue factor over hits from DoubleFrc stream
FullMatchValue factor over hits from DoubleFrc stream
FullMatchAnyValue factor over hits from DoubleFrc stream
ExactQueryMatchAvgValue factor over hits from DoubleFrc stream
BclmMixPlainKE5 factor over hits from OneClickFrcXfSp stream
Bm15StrictAnnotationK01 factor over hits from OneClickFrcXfSp stream
AllWcmWeightedValue factor over hits from OneClickFrcXfSp stream
AllWcmWeightedPrediction factor over hits from OneClickFrcXfSp stream
AllWcmMatch80AvgValue factor over hits from OneClickFrcXfSp stream
MixMatchWeightedValue factor over hits from OneClickFrcXfSp stream
AnnotationMatchWeightedValue factor over hits from OneClickFrcXfSp stream
BclmPlaneProximity1Bm15W0Size1K0001 factor over hits from OneClickFrcXfSp stream
BclmWeightedProximity1Bm15Size1K001 factor over hits from OneClickFrcXfSp stream
BclmMixPlainKE5 factor over hits from BQPRSample stream
AllWcmWeightedValue factor over hits from BQPRSample stream
AllWcmWeightedPrediction factor over hits from BQPRSample stream
AllWcmMaxPrediction factor over hits from BQPRSample stream
AllWcmMatch80AvgValue factor over hits from BQPRSample stream
MixMatchWeightedValue factor over hits from BQPRSample stream
CosineMatchMaxPrediction factor over hits from BQPRSample stream
AnnotationMaxValueWeighted factor over hits from BQPRSample stream
AnnotationMaxValue factor over hits from BQPRSample stream
AnnotationMatchWeightedValue factor over hits from BQPRSample stream
Bocm15K001 factor over hits from BQPRSample stream
BclmPlaneProximity1Bm15W0Size1K0001 factor over hits from BQPRSample stream
BclmWeightedProximity1Bm15Size1K001 factor over hits from BQPRSample stream
BclmPlaneProximity1Bm15W0Size1K0001 factor over hits from LongClickSP stream
Bm15MaxAnnotationK001 factor over hits from LongClickSP stream
FullMatchValue factor over hits from LongClickSP stream
MixMatchWeightedValue factor over hits from LongClickSP stream
CosineMatchMaxPrediction factor over hits from LongClickSP stream
AnnotationMaxValue factor over hits from LongClickSP stream
AnnotationMaxValueWeighted factor over hits from LongClickSP stream
AnnotationMatchWeightedValue factor over hits from LongClickSP stream
AllWcmMatch95AvgValue factor over hits from LongClickSP stream
AllWcmWeightedValue factor over hits from LongClickSP stream
AllWcmMaxMatch factor over hits from LongClickSP stream
AllWcmWeightedPrediction factor over hits from LongClickSP stream
Bocm15K001 factor over hits from LongClickSP stream
QueryPrefixMatchOriginalWordValue factor over hits from LongClickSP stream
BclmPlaneProximity1Bm15W0Size1K0001 factor over hits from SamplePeriodDayFrc stream
AttenV1Bm15K05 factor over hits from SamplePeriodDayFrc stream
FullMatchValue factor over hits from SamplePeriodDayFrc stream
FullMatchAnyValue factor over hits from SamplePeriodDayFrc stream
AllWcmWeightedValue factor over hits from SamplePeriodDayFrc stream
AllWcmWeightedPrediction factor over hits from SamplePeriodDayFrc stream
AllWcmMatch95AvgValue factor over hits from SamplePeriodDayFrc stream
AllWcmMatch80AvgValue factor over hits from SamplePeriodDayFrc stream
MixMatchWeightedValue factor over hits from SamplePeriodDayFrc stream
AnnotationMatchWeightedValue factor over hits from SamplePeriodDayFrc stream
AnnotationMaxValue factor over hits from SamplePeriodDayFrc stream
AnnotationMaxValueWeighted factor over hits from SamplePeriodDayFrc stream
Bocm15K001 factor over hits from SamplePeriodDayFrc stream
AllWcmWeightedValue factor over hits from CorrectedCtrXFactor stream
AllWcmMaxPrediction factor over hits from CorrectedCtrXFactor stream
AllWcmWeightedPrediction factor over hits from CorrectedCtrXFactor stream
AllWcmMatch80AvgValue factor over hits from CorrectedCtrXFactor stream
MixMatchWeightedValue factor over hits from CorrectedCtrXFactor stream
AnnotationMatchWeightedValue factor over hits from CorrectedCtrXFactor stream
BclmPlaneProximity1Bm15W0Size1K001 factor over hits from CorrectedCtrXFactor stream
BclmWeightedProximity1Bm15Size1K001 factor over hits from CorrectedCtrXFactor stream
AllWcmMaxPrediction factor over hits from LongClick stream
MixMatchWeightedValue factor over hits from LongClick stream
AnnotationMaxValueWeighted factor over hits from LongClick stream
FullMatchValue factor over hits from LongClick stream
AnnotationMatchWeightedValue factor over hits from LongClick stream
AllWcmWeightedValue factor over hits from SimpleClick stream
AllWcmWeightedPrediction factor over hits from SimpleClick stream
AllWcmMaxPrediction factor over hits from SimpleClick stream
MixMatchWeightedValue factor over hits from SimpleClick stream
AnnotationMatchWeightedValue factor over hits from SimpleClick stream
AnnotationMaxValueWeighted factor over hits from BrowserPageRank stream
AnnotationMatchWeightedValue factor over hits from BrowserPageRank stream
AnnotationMaxValue factor over hits from BrowserPageRank stream
Bocm15K001 factor over hits from BrowserPageRank stream
MixMatchWeightedValue factor over hits from OneClick stream
FullMatchValue factor over hits from OneClick stream
AnnotationMatchWeightedValue factor over hits from OneClick stream
AllWcmWeightedPrediction factor over hits from SplitDwellTime stream
Bm15MaxAnnotationK001 factor over hits from SplitDwellTime stream
BclmWeightedProximity1Bm15Size1K0001 factor over hits from QueryDwellTime stream
AttenV1Bm15K001 factor over hits from QueryDwellTime stream
MixMatchWeightedValue factor over hits from QueryDwellTime stream
AnnotationMaxValueWeighted factor over hits from QueryDwellTime stream
AnnotationMaxValue factor over hits from QueryDwellTime stream
AnnotationMatchWeightedValue factor over hits from QueryDwellTime stream
AllWcmWeightedValue factor over hits from QueryDwellTime stream
AllWcmMatch80AvgValue factor over hits from QueryDwellTime stream
BclmPlaneProximity1Bm15W0Size1K0001 factor over hits from RandomLogDBM35 stream
Bm15StrictAnnotationK001 factor over hits from RandomLogDBM35 stream
MixMatchWeightedValue factor over hits from RandomLogDBM35 stream
AnnotationMaxValueWeighted factor over hits from RandomLogDBM35 stream
AnnotationMatchWeightedValue factor over hits from RandomLogDBM35 stream
AllWcmWeightedValue factor over hits from RandomLogDBM35 stream
FullMatchValue factor over hits from RandomLogDBM35 stream
ExactQueryMatchAvgValue factor over hits from RandomLogDBM35 stream
relev_local == id
Binary factor about the mobile adaptability of the document. Taken from erf
Set to 1 when FI_NATIONAL_DOMAIN is 0 and herf.NationalDomainId is filled in.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MixMatchWeightedValue by stream QueryDwellTime. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MixMatchWeightedValue by stream QueryDwellTime. Weighted average value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MixMatchWeightedValue by stream QueryDwellTime. Minimum weighted value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: AnnotationMatchWeightedValue by stream QueryDwellTime. Minimum weighted value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: AnnotationMatchWeightedValue by stream QueryDwellTime. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: AllWcmMatch95AvgValue by stream QueryDwellTime. Weighted average value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MixMatchWeightedValue by stream BQPRSample. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: MixMatchWeightedValue by Stream DoubleFrc. Maximum weighted value of the factor by extensions.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
DSSM model trained on clicks.
Neural model of content quality for medical topics
request came from yandsearch (rearr.is_desktop == 1)
request came from touchsearch (rearr.is_mobile == 1)
request came from padsearch (rearr.is_tablet == 1)
request came from device with Android OS (rearr.dd_osfamily == Android)
request came from device with iOS (rearr.dd_osfamily == iOS)
request came from device with Windows OS (rearr.dd_osfamily == Windows)
request does not come from devices with Android, iOS or Windows OS (rearr.dd_osfamily != [Android, iOS, Windows])
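The seven request-origin factors above are simple indicators over rearr parameters. A minimal sketch, assuming the parameters arrive as a plain dictionary; the function name and the output keys are hypothetical, and only the `is_desktop`, `is_mobile`, `is_tablet` and `dd_osfamily` conditions come from the descriptions:

```python
KNOWN_OS = ("Android", "iOS", "Windows")

def device_factors(rearr):
    """Return the binary request-origin factors as floats (0.0 or 1.0).

    `rearr` is assumed to be a dict of request parameters; the output
    key names are illustrative, not real factor names.
    """
    def flag(name):
        # Accept both 1 and "1", since the parameter encoding is not specified.
        return float(str(rearr.get(name)) == "1")

    os_family = rearr.get("dd_osfamily")
    return {
        "from_yandsearch": flag("is_desktop"),
        "from_touchsearch": flag("is_mobile"),
        "from_padsearch": flag("is_tablet"),
        "os_android": float(os_family == "Android"),
        "os_ios": float(os_family == "iOS"),
        "os_windows": float(os_family == "Windows"),
        # Fires for any other OS family, including a missing value (assumption).
        "os_other": float(os_family not in KNOWN_OS),
    }
```

Treating a missing `dd_osfamily` as "other" is a choice made for this sketch; the descriptions do not say how an absent value is handled.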
A broken embedded video on the page.
FullMatchValue factor over hits from CorrectedCtrLongPeriod stream
MixMatchWeightedValue factor over hits from CorrectedCtrLongPeriod stream
AnnotationMaxValueWeighted factor over hits from CorrectedCtrLongPeriod stream
AnnotationMatchWeightedValue factor over hits from CorrectedCtrLongPeriod stream
AllWcmMatch95AvgValue factor over hits from CorrectedCtrLongPeriod stream
AllWcmMatch80AvgValue factor over hits from CorrectedCtrLongPeriod stream
AllWcmWeightedValue factor over hits from CorrectedCtrLongPeriod stream
AllWcmWeightedPrediction factor over hits from CorrectedCtrLongPeriod stream
Neural model of content quality for medical topics (for exponents)
BclmMixPlainKE5 factor over hits from NHopSumDwellTime stream
Match80AvgValue factor over hits from NHopSumDwellTime stream
Neural model of content quality for financial and legal topics
MixMatchWeightedValue factor over hits from NHopSumDwellTime stream
Neural model of content quality for financial and legal topics (for exponents)
BclmMixPlainKE5 factor over hits from FirstClickDtXf stream
FullMatchValue factor over hits from FirstClickDtXf stream
AnnotationMaxValueWeighted factor over hits from FirstClickDtXf stream
AnnotationMatchWeightedValue factor over hits from FirstClickDtXf stream
BclmPlaneProximity1Bm15W0Size1K001 factor over hits from FirstClickDtXf stream
Linguistic Boosting Factor. Type of extensions: RequestWithRegionName. Bm11 by document text and title
Linguistic Boosting Factor. Type of extensions: RequestWithRegionName. CosineMatchMaxPrediction by document text and title
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: AnnotationMatchWeightedValue by Stream LongClick.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: FullMatchValue by Stream OneClick.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: AnnotationMatchValue by Stream OneClick.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: AnnotationMatchWeightedValue by stream LongClickSP.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: FullMatchValue by stream LongClickSP.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: AnnotationMaxValueWeighted by Stream BQPRSample.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: Bm15 by stream group 1.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: Bm15 by stream group 2.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: BclmWeightedFLogW0 by Streaming Group 3.
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Chain0Wcm factor by document text
Random float in [0,1] by user request and document
Neural model of content quality for sos topics
the ratio of the total area of all Flash blocks to the screen area
Neural model of content quality for sos topics (for expos)
Copy of the old version of factor No. 294. Added for use at the L3 stage only. Coverage of the domain by three-letter sequences from the query. (Example: query "Chelyabinsk lottery", domain chelloto. Transliterate the query, find the three-letter sequences that are covered (che, hel, lot, olo), and take the proportion of all three-letter sequences that are covered.)
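The three-letter coverage described above can be sketched as follows. This is an illustration under stated assumptions: the query is assumed to be transliterated already, the denominator is taken to be the set of the domain's three-letter sequences (the description does not say which side is normalized), and both function names are hypothetical.

```python
def trigrams(text):
    """Return the set of three-letter sequences of a string, ignoring non-alphanumerics."""
    s = "".join(ch for ch in text.lower() if ch.isalnum())
    return {s[i:i + 3] for i in range(len(s) - 2)}

def domain_trigram_coverage(translit_query, domain):
    """Share of the domain's trigrams that also occur inside some query word.

    Assumption: coverage is normalized by the number of distinct domain trigrams.
    """
    domain_tris = trigrams(domain)
    if not domain_tris:
        return 0.0
    query_tris = set()
    for word in translit_query.split():
        # Trigrams are taken per word so they never span a word boundary.
        query_tris |= trigrams(word)
    return len(domain_tris & query_tris) / len(domain_tris)
```

For the hypothetical transliteration "chelyabinsk lotereya" and the domain "chelloto", the domain trigrams are che, hel, ell, llo, lot, oto, of which che, hel and lot occur in the query, so this sketch returns 0.5.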
Fast version of FI_URL_DOMAIN_FRACTION
Prediction of the session timestamp subject to the implementation of this query-document pair
Request-document dssm that predicts document sobriety
The document is a selection from the /tag tiktok
The document is a selection from the /discovery tiktok
The document is a selection from the /music tiktok
The request-document model of synsig.
Factor on the original query. Calculated using tokenized url. CosineMatchMaxPrediction algorithm.
Factor on the original query. Calculated using tokenized url. The weight of each hit is multiplied by 1/(1 + position of the word in the sentence). Word weights aggregation algorithm: Bm15. The normalization coefficient is 0.5.
Factor on the original query. Calculated over the document title. The word weights aggregation algorithm is BclmMixPlain: a linear mixture of the annotation Bclm weight and the weighted Positionless word weight, after which the word counters are aggregated through bm15. The normalization factor is 10^(-5).
Factor on the original query. Calculated over the document title. CMMatchTop5AvgMatchValue algorithm.
Factor on the original query. Calculated over the document title. Degree of query word coverage with exact form (without synonyms).
Factor on the original query. Calculated over the document title. The weight of each hit is multiplied by 1/(1 + position of the word in the sentence). Word weights aggregation algorithm: Bm15. The normalization coefficient is 0.5.
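The position-attenuated Bm15 factors above (a hit weight multiplied by 1/(1 + position), aggregated with a normalization coefficient of 0.5) might look roughly like the sketch below. Several things here are assumptions and not taken from the listing: the saturation form tf / (tf + k), which is the usual BM25 shape with length normalization switched off, the 0-based word positions, the per-word weights, and the final division by the total word weight. The function name is hypothetical.

```python
from collections import defaultdict

def positional_bm15(hits, word_weights, k=0.5):
    """Aggregate position-attenuated hits with a BM15-style saturation.

    hits: iterable of (word, position_in_sentence) pairs, positions 0-based (assumption).
    word_weights: dict mapping each query word to its weight (assumption).
    k: normalization coefficient; 0.5 is the value quoted in the descriptions.
    """
    tf = defaultdict(float)
    for word, position in hits:
        if word in word_weights:
            # Hit weight 1 / (1 + position), as stated in the factor description.
            tf[word] += 1.0 / (1.0 + position)

    total_weight = sum(word_weights.values())
    if total_weight == 0:
        return 0.0
    # Assumed saturation: tf / (tf + k), weighted by word weight.
    score = sum(w * tf[word] / (tf[word] + k) for word, w in word_weights.items())
    return score / total_weight
```

With this shape, a word that appears early in a sentence contributes more than the same word appearing late, and repeated hits saturate instead of growing without bound.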
Factor on the original query. Calculated over the document content. The word weights aggregation algorithm is BclmMixPlain: a linear mixture of the annotation Bclm weight and the weighted Positionless word weight, after which the word counters are aggregated through bm15. The normalization factor is 10^(-5).
Factor on the original query. Calculated over the document content. CosineMatchMaxPrediction algorithm.
Factor on the original query. Calculated over the document content. AllWcmWeightedPrediction algorithm.
Factor on the original query. Calculated over the document content. Word weights aggregation algorithm: Bocm15. The normalization coefficient is 0.01.
Factor on the original query. Calculated over the document content. Algorithm: QueryPartMatchSumValueAny.
Factor on the original query. Calculated over the document content. Degree of query word coverage with exact form (without synonyms).
Factor on the original query. Calculated over the document content. The degree of coverage of the query words in the exact form.
Factor on the original query. Calculated over the document content. Word weights aggregation algorithm: Bm15MaxAnnotation. The normalization factor is 0.01.
Url is a channel/post from a verified social network account
Dssm, predicting whether a site is a mimicry
CMMatch80AvgValue factor over hits from QueryDwellTime stream
CMMatchTop5AvgMatch factor over hits from DoubleFrc stream
PerWordCMMaxMatchMin factor over hits from OneClickFrcXfSp stream
PerWordCMMaxMatchMin factor over hits from FirstClickDtXf stream
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: PerWordCMMaxMatchMin by stream LongClickSP. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: PerWordCMMaxMatchMin by Stream OneClick. Maximum weighted value of the factor by extensions.
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: PerWordCMMaxMatchMin by stream FirstClickDtXf. Minimum weighted value of the factor by extension top.
Distance from the city from which the request was made to Ankara
Distance from the city where the request was made to Magadan
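The two distance factors above need some distance measure between the request city and a fixed reference city. The listing does not say which measure or unit is used, so the sketch below assumes great-circle distance in kilometres via the haversine formula; the coordinates are approximate and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0          # mean Earth radius
ANKARA = (39.93, 32.86)           # approximate (latitude, longitude) in degrees
MAGADAN = (59.56, 150.80)         # approximate (latitude, longitude) in degrees

def great_circle_km(a, b):
    """Haversine distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))

def distance_factors(request_city):
    """Return the distances from the request city to both reference cities."""
    return {
        "to_ankara_km": great_circle_km(request_city, ANKARA),
        "to_magadan_km": great_circle_km(request_city, MAGADAN),
    }
```

Two reference points far apart in both latitude and longitude give a ranking model a coarse way to locate the user without a categorical region feature, which is one plausible reason for pairing these factors with the raw latitude and longitude.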

Latitude

#1311
Geographical latitude of the city from where the request was made
Geographical longitude of the city from where the query was made
FullMatchValue factor over hits from LongClick stream (Mobile sessions filtered)
CosineMatchMaxPrediction factor over hits from LongClick stream (Mobile sessions filtered)
AnnotationMatchWeightedValue factor over hits from LongClick stream (Mobile sessions filtered)
AllWcmMatch95AvgValue factor over hits from LongClick stream (Mobile sessions filtered)
AllWcmWeightedValue factor over hits from LongClick stream (Mobile sessions filtered)
AllWcmWeightedPrediction factor over hits from LongClick stream (Mobile sessions filtered)
CMMatchTop5AvgValue factor over hits from LongClick stream (Mobile sessions filtered)
Bm15MaxAnnotationK001 factor over hits from LongClick stream (Mobile sessions filtered)
Linguistic Boosting Factor. Extension type: XfDtShow. Factor: PerWordCMMaxMatchMin on incoming links. Maximum weighted value of the factor by extensions.
Static URL factor by search sessions for 1600 days. Normal Ctr.
Static URL factor by search sessions over 1600 days. Average DwellTime, where the DwellTime of a session is truncated if it exceeds 3600 seconds.
Static URL factor by search sessions over 1600 days. Average DwellTime, where the DwellTime of a session is truncated if it exceeds 180 seconds.
Static URL factor by search sessions over 1600 days. Probability that a click on the URL lasts longer than 120 seconds.
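The dwell-time statistics above can be illustrated with a short sketch. It assumes that "truncated" means capped at the limit (so a 5000-second visit counts as 3600 or 180 seconds) and not discarded, and that dwell times are given in seconds, one per click; both function names are hypothetical.

```python
def truncated_mean_dwell(dwell_times, cap_seconds):
    """Average dwell time with every value capped at cap_seconds (assumed meaning of 'truncated')."""
    if not dwell_times:
        return 0.0
    return sum(min(t, cap_seconds) for t in dwell_times) / len(dwell_times)

def long_click_probability(dwell_times, threshold_seconds=120):
    """Empirical share of clicks whose dwell time exceeds the threshold."""
    if not dwell_times:
        return 0.0
    return sum(t > threshold_seconds for t in dwell_times) / len(dwell_times)
```

The two caps behave quite differently: with a 3600-second cap the average is still dominated by long visits, while a 180-second cap mostly measures how often users leave quickly.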
Static URL factor by search sessions for 1600 days. Logarithm of the number of hits.
Static URL factor by search sessions over 1600 days. The probability that the URL is clicked given that at least one URL ranked higher was not clicked.
Static URL factor by search sessions over 1600 days. The probability that the URL is not clicked given that at least one URL ranked lower was clicked.
Static URL factor by search sessions for 1600 days. Normal Ctr. Localization to country level.
Static URL factor by search sessions over 1600 days. Average DwellTime, with DwellTime from session truncated if more than 3600 seconds. Localization to country level.
Static URL factor by search sessions for 1600 days. The probability that the click on the URL will be more than 120 seconds. Localization to country level.
Static URL factor by search sessions for 1600 days. Average URL position for all queries. Localization to the country level.
Static URL factor by search sessions for 1600 days. Logarithm of the number of impressions. Localization to the country level.
DSSM model trained on clicks. Takes bigrams into account.
MixMatchWeightedValue factor over hits from FirstLastClick stream (Mobile sessions filtered)
CosineMatchMaxPrediction factor over hits from FirstLastClick stream (Mobile sessions filtered)
FullMatchValue factor over hits from FirstLastClick stream (Mobile sessions filtered)
AllWcmMatch95AvgValue factor over hits from FirstLastClick stream (Mobile sessions filtered)
CMMatchTop5AvgValue factor over hits from FirstLastClick stream (Mobile sessions filtered)
AllWcmWeightedValue factor over hits from FirstLastClick stream (Mobile sessions filtered)
Was the request made by voice
AllWcmWeightedValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
AllWcmMatch95AvgValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
CMMatchTop5AvgValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
AnnotationMatchWeightedValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
FullMatchValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
MixMatchWeightedValue factor over hits from AvgDTWeightedByRankMobile stream (Mobile sessions filtered)
Linguistic Boosting Factor. Type of extensions: XfDtShow. Factor: AvgPerTrigramMaxValueAny by stream group 5. Weighted average value of the factor by the top of extensions.
AvgPerTrigramAvgValueAny factor by CorrectedCtrLongPeriod Stream
DSSM model trained on clicks. Takes bigrams into account. Embeddings for documents are computed offline.
The quality rank of the texts on the host. The higher the value, the more likely the host is full of rewritten articles and poor copywriting ordered on content exchanges. It burns harder as query aggregation.
Minimum from gradients according to the bigram LogDwelltime model.
Maximum of the gradients according to the bigram LogDwelltime model.
The second central moment (variance) of the gradients according to the bigram LogDwelltime model.
The third central moment of the gradients according to the bigram LogDwelltime model.
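The four gradient statistics above (minimum, maximum, and the second and third central moments) can be computed as in the sketch below. It assumes population moments (division by n, not n - 1) and a non-empty list of per-item gradients; how the gradients of the bigram LogDwelltime model are obtained is outside the sketch, and the function name is hypothetical.

```python
def gradient_stats(gradients):
    """Return min, max, and the second and third central moments of a non-empty list."""
    n = len(gradients)
    mean = sum(gradients) / n
    # k-th central moment: average of (g - mean) ** k, population form (assumption).
    m2 = sum((g - mean) ** 2 for g in gradients) / n
    m3 = sum((g - mean) ** 3 for g in gradients) / n
    return {"min": min(gradients), "max": max(gradients), "m2": m2, "m3": m3}
```

The second moment measures the spread of the gradients, and the sign of the third shows whether the distribution leans toward large positive or large negative values.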
The probability that vk.com host is popular for this query according to the corresponding dssm model.
The probability that the onliner.by host is popular for this query according to the corresponding dssm-model.
The probability that the host rambler.ru is popular for this query according to the corresponding dssm-model.
The probability that the host expertcen.ru is popular for this query according to the corresponding dssm-model.
The probability that the host sunhome.ru is popular for this query according to the corresponding dssm-model.
Static URL factor by browser logs for a maximum period. Percentage of traffic from social networks in all traffic from other hosts and search.
Static URL factor by browser logs for maximum period. Average number of direct descendants on the host on which more than 90 seconds were spent. A descendant is direct only if there is a link from our page to the descendant and it was clicked.
Static URL factor by browser logs over maximum period. The average maximum tree depth with the root in the current URL when the URL is visited from other hosts.
Static URL factor by browser logs over maximal period. The number of times the page has been accessed from the serp divided by the total number of pages accessed from the serp. The closer to 1, the more times the page was opened as the only page in the session.
Static URL factor by browser logs for the maximum period. Average length of search sessions, when the page was navigated to from the serp
Static URL factor by browser logs for maximal period. See the wiki for the formula to calculate the factor.
Static URL factor by browser logs for maximal period. See the wiki for the formula to calculate the factor.
Static URL factor by browser logs for maximal period. Probability that the user will spend > 120 seconds on the page.
Static URL factor by browser logs for maximum period. The number of leaves in the URL subtree. In this case leaves are pages from which there were no jumps.
Static URL factor from browser logs for the maximum period. The average time spent on the page and on all descendants of the page (URLs that were navigated to) from the host. Cut if the total Dt is more than 10 minutes.
Static URL factor by browser logs for maximal period. Minimum unix time when page first appeared in logs.
Static URL factor by browser logs for maximal period. Difference between average and minimum unix time when page appeared in logs.
Static URL factor by browser logs for maximum period. The average latitude from where the page was viewed.
Static URL factor by browser logs for maximum period. Average longitude from where the page was viewed.
Static URL factor by browser logs for maximum period. Probability of download from page
Static URL factor by browser logs for maximum period. Probability of image download from page
Static URL factor by browser logs for maximum period. Probability of downloading torrent file from page
Static URL factor by browser logs for maximal period. See wiki for factor calculation formula. Localization to country level.
Static URL factor by browser logs for maximum period. The number of leaves in the URL subtree. In this case leaves are pages from which there were no jumps. Localization to the country level.
Static URL factor from browser logs for the maximum period. The average time spent on the page and on all descendants of the page (URLs that were navigated to) from the host. Cut if the total Dt is more than 10 minutes. Localization to country level.
The sum of the scores of the query's words according to the 3grams-yandex-direct language model.
The sum of the scores of the query's words according to the web-mt language model.
Static URL factor based on browser logs over a maximum period. Rank based only on UBLP counters, which makes it possible to find many SBR losses.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: BclmWeightedFLogW0_K0.001 by FieldSet3. Weighted average of factor values by top-10 extensions.
Linguistic Boosting Factor. Extension type: QueryToText. Factor: MinWindowSize by document content. Weighted average of factor values by extensions.
Linguistic Boosting Factor. Average weight of extensions of QueryToText type.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: MixMatchWeightedValue by QueryDwellTime stream. Weighted average of the factor values by extensions.
Linguistic Boosting Factor. Extension type: QueryToText. Factor: MinWindowSize by document content. Weighted average of the factor values by the top 10 extensions.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: Bm15FLogW0_K0.0001 by url and header. Maximum value of the factor by extensions.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: BclmWeightedFLogW0_K0.001 by FieldSet3. Weighted average of factor values by extensions.
Linguistic Boosting Factor. The average weight of Qfuf-type extensions.
Linguistic Boosting Factor. Extension type: QueryToText. Factor: PairMinProximity by document content. Average of factor values by extensions.
Linguistic Boosting Factor. Type of extensions: Qfuf. The renormalized total weight of the extensions.
Linguistic Boosting Factor. Extension type: QueryToText. Factor: Bocm11_Norm256 by document text. Average value of the factor by extensions.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: CosineMatchMaxPrediction by document text. Maximal value of the factor by extensions.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: Bm15FLog_K0.001 by FieldSet1. Weighted average of factor values with quadratic weight by the top 10 expansions by factor value.
Linguistic Boosting Factor. Type of extensions: Qfuf. Factor: Bocm11_Norm256 by document text. Maximal value of the factor by extensions.
Linguistic Boosting Factor. Extension type: Qfuf. Factor: Bm15FLogW0_K0.0001 by url and header. Weighted average of factor values by extensions.
DSSM model trained on clicks, target=OneClicks/Clicks. Takes bigrams into account.
DSSM model trained on clicks, target=QueryDwellTime stream value. Takes bigrams into account.
The normalized sum of the weights of the query words that occurred in the text of the document or links to it.
The normalized sum of the query word weights that EQUAL_BY_STRING in the document text or links to it.
The normalized sum of the weights of the query words that appeared in the text of the document.
The normalized sum of the weights of the query words that appeared in the links to the document.
The normalized sum of the query word weights that EQUAL_BY_STRING in the links to the document.
The normalized sum of weights by IFiltrationModel of query words that were encountered in the text of the document or links to it.
The normalized sum of weights by IFiltrationModel for query words that EQUAL_BY_STRING in the document text or links to it.
The normalized sum of weights by IFiltrationModel of query words that EQUAL_BY_LEMMA in the document text or links to it.
The normalized sum of weights by IFiltrationModel of query words, which occurred in links to the document.
The normalized sum of weights by IFiltrationModel of query words that are EQUAL_BY_STRING in the links to the document.
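The ten "normalized sum of weights" factors above share one shape: sum the weights of the query words that were matched in some source, then normalize. A minimal sketch, assuming normalization means division by the total weight of all query words (the listing does not define it); the function name and the example weights are hypothetical, and the weights could equally come from IFiltrationModel.

```python
def matched_weight_share(query_weights, matched_words):
    """Share of the total query word weight carried by the matched words.

    query_weights: dict mapping each query word to its weight.
    matched_words: set of query words found in the chosen source (document
    text, links, or both) under the chosen match type, for example
    EQUAL_BY_STRING or EQUAL_BY_LEMMA; the matching itself is outside this sketch.
    """
    total = sum(query_weights.values())
    if total == 0:
        return 0.0
    matched = sum(w for word, w in query_weights.items() if word in matched_words)
    return matched / total
```

Under this reading the individual factors differ only in how `matched_words` is built and in which weighting model supplies `query_weights`.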
Linguistic Boosting Factor. Type of extensions: Qfuf. Aggregation by all extensions. Highest factor value. By stream from LinkAnnIndicator link index. Algorithm AnnotationMaxValueWeighted - maximum weight (by MainWeights word weights) of annotation coverage, weighted by annotation weight
Linguistic Boosting Factor. Type of extensions: Qfuf. Aggregation by all extensions. Highest factor value. By stream from LinkAnnIndicator link index. Algorithm AnnotationMaxValueWeighted - maximum weight (by MainWeights word weights) of annotation coverage, weighted by annotation weight
Linguistic Boosting Factor. Type of extensions: XfDtShow. Aggregation by all extensions. Largest weighted value of the factor. Normalized to the maximum weight of the extension. Based on stream from LinkAnnIndicator link index. PerWordCMMaxMatchMin algorithm: minimum CMMaxMatch weight by words.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn (closest by dssm-model, trained to predict XfDtShow extensions). Aggregation over all extensions. Highest weighted factor value. A mixture of multiple streams, where the weight is computed from a fixed polynomial of the component weights on a given annotation. The word weights aggregation algorithm is BclmMixPlain: a linear mixture of the annotation Bclm weight and the weighted Positionless word weight, then the word counters are aggregated via bm15. The normalization factor is 10^(-5).
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn (closest by dssm-model, trained to predict XfDtShow extensions). Aggregation over all extensions. Highest weighted factor value. Normalized to the maximum weight of the extension. Stream: CorrectedCtrLongPeriod. Degree of query word coverage with exact form (without synonyms).
Linguistic Boosting Factor. Type of extensions: Qfuf. Aggregation over all extensions. Largest weighted value of the factor. Normalized to the maximum weight of the extension. Vpcg result for long long period, data: CorrectedClicks. Average weight of the annotations among those in which the query turned out to be an exact substring.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn (closest by dssm-model, trained to predict XfDtShow extensions). Aggregation over all extensions. Largest weighted factor value. Normalized to the maximum weight of the extension. Stream: CorrectedCtrLongPeriod. Algorithm BclmPlaneProximity1Bm15W0Size1: uses bclm with weightless weighting if there are multiple query words, if there is one word then the match-weighted sum of hits is used. The normalization coefficient is 0.001.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn (closest by dssm-model, trained to predict XfDtShow extensions). Aggregation over all extensions. Average weight of extensions.
Document dssm model language classifier rus.
Document dssm model language classifier eng.
Document dssm model language classifier other.
Prediction of the DSSM model that detects irrelevant Alice responses
The average value of News by request for the year. Calculated offline.
The average value of AddTime by request for the year. Calculated offline.
The average value of TxtHiRelSy for the query per year. Calculated offline.
The average TextLike value per query for the year. Calculated offline.
The average HasNoAllWordsTRSy value by query per year. Calculated offline.
The average IsForum on the request for the year. Calculated offline.
The average HasPayments on request for the year. Calculated offline.
Average value of YabarHostAvgTime2 per request per year. Calculated offline.
The average value of YabarUrlVisitors by request for the year. Calculated offline.
Average value of QueryDOwnerOnlyClickRate by request per year. Calculated offline.
The average DaterAge by query for the year. Calculated offline.
The average value of LongestText by query for the year. Calculated offline.
The average value of DifferentInternalLinks by query per year. Calculated offline.
Average value of QueryDOwnerOnlyClickRate_Reg by query per year. Calculated offline.
The average IsHub value per query per year. Calculated offline.
Average BM25_0 on request for the year. Calculated offline.
The average Bocm by query for the year. Calculated offline.
The average IsIndexPage value for the query per year. Calculated offline.
The average value of QueriesAvgCM2 per request per year. Calculated offline.
Average BrowserHostDownloadProbability by request per year. Calculated offline.
The average value of RegBrowserUserHub per query per year. Calculated offline.
The average value of AuxTitleBM25 on the request for the year. Calculated offline.
The average value of QueryUrlCorrectedCtrXfactor by query per year. Calculated offline.
Average QueryToDocAllSumFCountTextBm11Norm16384 by query for the year. Calculated offline.
Average value of XfDtShowAllSumWFSumWBodyMinWindowSize by request per year. Calculated offline.
The click-weighted average of IsMainPage by query for the year. Calculated offline.
The click-weighted average of YabarUrlAvgTime on the request for the year. Calculated offline.
The click-weighted average of DifferentInternalLinks by query for the year. Calculated offline.
The dwell-time-weighted average of UrlDomainFraction by query for the year. Calculated offline.
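The click-weighted and dwell-time-weighted yearly averages above differ from the plain averages only in the weights attached to each observation. A minimal sketch, assuming one factor value and one weight (a click count or a dwell time) per logged document for the query; the function name is hypothetical.

```python
def weighted_average(values, weights):
    """Weighted mean of factor values; returns 0.0 when the total weight is zero."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return sum(v * w for v, w in zip(values, weights)) / total
```

For example, with IsMainPage values [1, 0, 0] and click counts [3, 1, 0], the click-weighted average is 0.75, while the plain average (all weights equal to 1) is about 0.33.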
BM25FdPR with normalization to the average document length depending on the document language. Only text hits are used.
Does owner have metrika or not
The document has a turbo page for the mobile platform.
Document annotations count in the whole history of the Search (DSSM AnnReg models helper)
Document annotation words count in the whole history of the Search (DSSM AnnReg models helper)
Document annotation regions count in the whole history of the Search (DSSM AnnReg models helper)
Query-MainContentKeywords similarity, target: logDwellTime
Maximum value of domain yellowness (based on Toloka)
Mean value of domain yellowness (based on Toloka)
Median of domain yellowness (based on Toloka)
Minimum value of domain yellowness (based on Toloka)
Dssm Boosting query self similarity for XfWeight model.
Dssm Boosting AvgTop02Score aggregation for XfWeight model over 5-means centroids.
Dssm Boosting AvgTop04Score aggregation for XfWeight model over 5-means centroids.
Dssm Boosting AvgTop02ScoreAvgClusterTop3Weighted aggregation for XfWeight model over 5-means centroids.
Dssm Boosting AvgTop02Score aggregation for XfWeight model over 5-means centroids (query as expansion).
Dssm Boosting AvgTop02ScoreAvgClusterTop3Weighted aggregation for XfWeight model over 5-means centroids (query as expansion).
Dssm Boosting query self similarity for XfOne model.
Dssm Boosting Score aggregation for XfOne model over 1-means centroids.
Dssm Boosting ScaledSumWeight aggregation for XfOne model over 1-means centroids.
Dssm Boosting Score aggregation for XfOne model over 1-means centroids (query as expansion).
Dssm Boosting ScoreAvgNearest1Weighted aggregation for XfOne model over 1-means centroids (query as expansion).
Dssm Boosting ScoreAvgNearest5Weighted aggregation for XfOne model over 1-means centroids (query as expansion).
Dssm Boosting Score aggregation for XfOneSe model over 1-means centroids.
Dssm Boosting ScoreScaledSumWeighted aggregation for XfOneSe model over 1-means centroids.
Dssm Boosting ScoreAvgNearest5Weighted aggregation for XfOneSe model over 1-means centroids.
Dssm Boosting query self similarity for Ctr model.
Dssm Boosting Score aggregation for Ctr model over 1-means centroids.
Dssm Boosting Score aggregation for Ctr model over 1-means centroids (query as expansion).
Dssm Boosting ScoreScaledSumWeighted aggregation for Ctr model over 1-means centroids (query as expansion).
Dssm Boosting ScoreAvgNearest1Weighted aggregation for Ctr model over 1-means centroids (query as expansion).
Variance of the domain yellowness distribution (based on Toloka)
The vpcg result for the long long period, data: CorrectedClicks. Factor FullMatchPrediction
The vpcg result for the long long period, data: CorrectedClicks. Factor AllWcmMatch95AvgValue
The vpcg result for the long long period, data: CorrectedClicks. Factor CMMatchTop5AvgValue
The vpcg result for the long long period, data: CorrectedClicks. Factor AnnotationMaxValueWeighted
The vpcg result for the long long period, data: CorrectedClicks. Factor MixMatchWeightedValue
The vpcg result for the long long period, data: CorrectedClicks. Factor CMMatchTop5AvgPrediction
DSSM model trained on CTRs without miner.
Predicting dssm (url + title) trained on page_quality signal and embedded in RTHub, first slot.
Predicting dssm (url + title), trained on page_quality signal and embedded in RTHub, second slot.
The principal components of the query embedding from the DssmCtrNoMiner model
The principal components of the query embedding from the DssmCtrNoMiner model
The principal components of the query embedding from the DssmCtrNoMiner model
The principal components of the query embedding from the DssmCtrNoMiner model
The principal components of the query embedding from the DssmCtrNoMiner model
The principal components of the query embedding from the DssmCtrNoMiner model
DSSM model trained on click odd pool
DSSM model trained on click personalization pool
DSSM model trained on click triangle pool
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: CMMatchTop5AvgMatchValue by Stream FloatMultiplicity of the LinkAnn index
Linguistic Boosting Factor. Factor: PerWordAMMaxValueMin by Stream FloatMultiplicity of the LinkAnn index
Linguistic Boosting Factor. Factor: AttenV1Bm15K001 by Stream FloatMultiplicity of the LinkAnn index
Linguistic Boosting Factor. Factor: Bocm11Norm256 by Stream IsExternal of the LinkAnn index
Linguistic Boosting Factor. Extension type: RequestWithRegionName. Factor: AnnotationMaxValue by Stream FloatMultiplicity of the LinkAnn index
DSSM model trained on clicks without miner (with no-clicks and AM-hard negatives). Takes bigrams into account.
AVG aggregation of HasPayments web factor using random log
AVG aggregation of VideoQuery web factor using random log
AVG aggregation of SyntQuality web factor using random log
PERCENTALE_90 aggregation of GeoRegionalityVNew web factor using random log
AVG aggregation of QClassDownload web factor using random log
AVG aggregation of IsMusic web factor using random log
PERCENTALE_25 aggregation of QueryThEncyclopedic web factor using random log
AVG aggregation of CommercialOwnerRank_Reg web factor using random log
PERCENTALE_25 aggregation of YabarWordDepthNodesGradientMin web factor using random log
AVG aggregation of PopularSEFRCBrowser web factor using random log
AVG aggregation of URLClicksMaxGeoRegionFRCRatio web factor using random log
PERCENTALE_90 aggregation of UBLongPeriodDirectHChildren90CntFromExtHost web factor using random log
PERCENTALE_90 aggregation of UBLongPeriodDtUrlHChildrenCut600Reg web factor using random log
AVG aggregation of IsPicture web factor using random log
AVG aggregation of ErratumLogQueryProbability web factor using random log
Length of the click from the given country, predicted from the query and country by a dssm model.
Average News value for the query over the year, predicted by a neural network.
Average AddTime value for the query over the year, predicted by a neural network.
Average TxtHiRelSy value for the query over the year, predicted by a neural network.
Average TextLike value for the query over the year, predicted by a neural network.
Average HasNoAllWordsTRSy value for the query over the year, predicted by a neural network.
Average IsForum value for the query over the year, predicted by a neural network.
Average HasPayments value for the query over the year, predicted by a neural network.
Average YabarHostAvgTime2 value for the query over the year, predicted by a neural network.
Average YabarUrlVisitors value for the query over the year, predicted by a neural network.
Average QueryDOwnerOnlyClickRate value over the year, predicted by a neural network.
Average DaterAge value for the query over the year, predicted by a neural network.
Average LongestText value for the query over the year, predicted by a neural network.
Average DifferentInternalLinks value for the query over the year, predicted by a neural network.
Average QueryDOwnerOnlyClickRate_Reg value over the year, predicted by a neural network.
Type of canonized url of Yandex music - track
Average Bocm value for the query over the year, predicted by a neural network.
Average IsIndexPage value for the query over the year, predicted by a neural network.
Average QueriesAvgCM2 value for the query over the year, predicted by a neural network.
Average BrowserHostDownloadProbability value for the query over the year, predicted by a neural network.
Average RegBrowserUserHub value for the query over the year, predicted by a neural network.
Average AuxTitleBM25 value for the query over the year, predicted by a neural network.
Average QueryUrlCorrectedCtrXfactor value over the year, predicted by a neural network.
Average QueryToDocAllSumFCountTextBm11Norm16384 value over the year, predicted by a neural network.
Average XfDtShowAllSumWFSumWBodyMinWindowSize value over the year, predicted by a neural network.
Click-weighted average IsMainPage value for the query over the year, predicted by a neural network.
Click-weighted average YabarUrlAvgTime value for the query over the year, predicted by a neural network.
Click-weighted average DifferentInternalLinks value for the query over the year, predicted by a neural network.
Dwelltime-weighted average UrlDomainFraction value for the query over the year, predicted by a neural network.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: BclmWeightedFLogW0 by stream group 3. Maximum weighted value of the factor.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog by stream group 2. Maximum weighted value of the factor.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bag OriginalRequestFraction by Stream FieldSetBagOfWords.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: MixMatchWeightedValue by stream QueryDwellTime. Maximum weighted value of the factor normalized to the total weight.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bm15 by Title stream. Sum of the weighted factor values multiplied by the weight, normalized by the total weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: BclmWeightedFLogW0 by stream group 3. Minimum value of the factor by extension top.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: BclmWeightedFLogW0 by stream group 3. Sum of the weighted factor values multiplied by the weight, normalized by the total weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog by stream group 1. Maximum weighted value of the factor.
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog by stream group 1. Total weighted value of the factor normalized to the total weight.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bag AnnotationMatchAvgValue by Stream LongClickSP.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bm15FLog by stream group 1. Sum of the weighted factor values multiplied by the weight over the extension group, normalized by the total weight over the extension group (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Type of extensions: XfDtShowKnn. Factor: Bm15FLog by stream group 1. Minimum weighted value of the factor on the extension top normalized to the maximum weight on the extension top.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: PairMinProximity by Stream Body. Maximum weighted value of the factor normalized to the total weight.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bm15FLog by stream group 1. Sum of the weighted factor values multiplied by the weight, normalized by the total weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bag AnnotationMatchAvgValue by Stream SimpleClick.
Linguistic Boosting Factor. Extension type: XfDtShowKnn. Factor: Bag CosineMaxMatch by Stream Title.
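The XfDtShowKnn entries above name three ways of combining per-extension factor values F_i with extension weights W_i: the maximum weighted value, the weighted sum normalized by the total weight, and the weight-multiplied variant given by the formula. A minimal Python sketch of these three aggregations follows. The function names and the zero-weight fallback are assumptions made for illustration, not the production code.

```python
from typing import Sequence


def max_weighted(weights: Sequence[float], values: Sequence[float]) -> float:
    """Maximum of W_i * F_i over all extensions."""
    return max(w * f for w, f in zip(weights, values))


def weighted_mean(weights: Sequence[float], values: Sequence[float]) -> float:
    """sum(W_i * F_i) / sum(W_i): weighted value normalized by total weight."""
    total = sum(weights)
    if total == 0:  # assumption: return 0.0 when there is no weight at all
        return 0.0
    return sum(w * f for w, f in zip(weights, values)) / total


def weight_multiplied_mean(weights: Sequence[float], values: Sequence[float]) -> float:
    """sum(W_i * (W_i * F_i)) / sum(W_i), as in the formula in the entries."""
    total = sum(weights)
    if total == 0:  # assumption: same fallback as above
        return 0.0
    return sum(w * (w * f) for w, f in zip(weights, values)) / total
```

Because the weight enters twice in the last variant, low-weight extensions contribute much less there than in the plain weighted mean.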
Predicting the probability that the query is localizable according to the Regionality5 rule.
Document contains the full name (FIO) from the original request
Factor for Page Quality 1 experiments
DSSM model trained on clicks without miner (with no-clicks and am_hard negatives 50/50 and then on am_hard negatives only). Takes bigrams into account.
Dssm Boosting Score aggregation for XfOneSeAmSsHard model over 1-means centroids.
Dssm Boosting ScoreAvgClusterTop3Weighted aggregation for XfOneSeAmSsHard model over 1-means centroids.
Factor for Page Quality 2 experiments
Average by url maximum yellowness of teaser image
Average by url average yellowness of teaser image
Ratio of yellow images in teasers on host
Average yellow images count on host
Average teasers count on host
Average teasers area on host
Average by url minimum yellowness of teaser text
Average by url average yellowness of teaser text
Background is clickable advertisement
Average ratio of adverts on screen
Ratio of adverts on screen on main page
Average count of adverts on screen
Ratio of outgoing advertisement traffic to all traffic (desktop)
Ratio of outgoing real-time bidding traffic to all traffic (desktop)
Rating of news agency from agencies.json (Yandex.News resource)
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Factor: Norm256 by stream Bocm11. Sum of the weighted factor values multiplied by the weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Factor: MinWindowSize by Stream Body. Sum of the weighted factor values multiplied by the weight over the top extensions, normalized by the total weight of the top extensions (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Factor: MinWindowSize by Stream Body. Sum of the weighted factor values multiplied by the weight, normalized by the total weight (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Factor: Norm256 by stream Bocm11. Sum of the weighted factor values multiplied by the weight over the top extensions (\frac{\sum_i W_i \cdot (W_i \cdot F_i)}{\sum_i W_i}).
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Minimal extension weight.
Linguistic Boosting Factor. Type of extensions: QueryToTextByXfDtShowKnn. The arithmetic mean of the weights of the extensions.
Linguistic Boosting Factor. Type of extensions: QueryToTextByXfDtShowKnn. Total weight of extensions.
Linguistic Boosting Factor. Extension type: QueryToTextByXfDtShowKnn. Factor: Bag OriginalRequestFraction by Stream FieldSetBagOfWords.
Factor for Page Quality 3 experiments
Characterizes the query by the degree of change from adding a fixed word (number of some year), uses dssm model DssmBoostingXfOneSeAmSsHard
Characterizes the query by the degree of change from adding a fixed word ('online' for Cyrillic), uses the dssm model DssmBoostingXfOneSeAmSsHard
Characterizes the query by the degree of change from removing a fixed word ('site' for Cyrillic), uses the dssm model DssmBoostingXfOneSeAmSsHard
Document comes from the shards with fresh documents
For each word, the average HasNoTr value over queries for 3 months is computed offline. The maximum of these values over all query words is then taken.
For each word, the average IsLJ value over queries for 3 months is computed offline. The maximum of these values over all query words is then taken.
For each word, the average BclmLite value over queries for 3 months is computed offline. The minimum of these values over all query words is then taken.
For each word, the average DBM40 value over queries for 3 months is computed offline. The maximum of these values over all non-stop query words is then taken.
For each word, the average IsDesktopRequest value over queries for 3 months is computed offline. The maximum of these values over all non-stop query words is then taken.
For each word, the average RLQAvgHasNoAllWordsTrSyn value over queries for 3 months is computed offline. The maximum of these values over all query words is then taken.
For each word, the average DssmAggregatedAnnReg value over queries for 3 months is computed offline. The maximum of these values over all query words is then taken.
For each word, the average MetaNumUrlsPerHostFixed value over queries for 3 months is computed offline. The maximum of these values over all query words is then taken.
For each word, the average MaxSDIsNavMxQueryMax value over queries for 3 months is computed offline. The maximum of these values over all non-stop query words is then taken.
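The per-word factors above share one pattern: a per-word average is prepared offline, and at query time the maximum (or minimum) over the query words is taken, optionally skipping stop words. A minimal sketch of the query-time step follows. It assumes the offline averages are already available as a dictionary. The names `word_avg`, `stop_words` and the default for unseen words are hypothetical.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Sequence


def aggregate_over_words(
    query_words: Sequence[str],
    word_avg: Dict[str, float],
    reducer: Callable[[Iterable[float]], float] = max,
    stop_words: FrozenSet[str] = frozenset(),
    default: float = 0.0,
) -> float:
    """Reduce precomputed per-word averages over the words of one query.

    word_avg maps a word to its offline average of some web factor.
    Words in stop_words are skipped; unseen words fall back to `default`
    (an assumption, the source does not say how they are handled).
    """
    values = [word_avg.get(w, default) for w in query_words if w not in stop_words]
    return reducer(values) if values else default
```

Passing `reducer=min` gives the BclmLite-style variant, and a non-empty `stop_words` set gives the "non-stop query words" variants.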
AVG aggregation of VisitsFromWiki web factor using random log
Factor for Page Quality 4 experiments
PERCENTALE_25 aggregation of NavLinear web factor using random log
PERCENTALE_90 aggregation of Found web factor using random log
AVG aggregation of SubqueryThMatch web factor using random log
Factor for Page Quality 5 experiments
AVG aggregation of SegmentWordPortionFromMainContent web factor using random log
AVG aggregation of XfDtShowAllMaxFFieldSet2Bm15FLogK0001 web factor using random log
AVG aggregation of QueryRegionSize web factor using random log
The document came from WebTier1
AVG aggregation of IsRelevLocaleUA web factor using random log
PERCENTALE_90 aggregation of QfufAllSumWFSumWFieldSet3BclmWeightedFLogW0K0001 web factor using random log
PERCENTALE_90 aggregation of DssmBoostingCtrQuerySelfSimilarity web factor using random log
AVG aggregation of QueryToDocAllSumFCountTextBocm11Norm256 web factor using random log. NOTE: QueryToDocAllSumFCountTextBocm11Norm256 has been removed.
PERCENTALE_90 aggregation of IsNavMxQuery web factor using random log
The document came from Platinum0
AVG aggregation of DBM15Wares2 web factor using random log
PERCENTALE_90 aggregation of UrlNGramsModel web factor using random log
A neural document model for detecting unexpected shocking content
Medical host quality fresh.
PERCENTALE_25 aggregation of DssmBoostingCtrKMeans1ScoreScaledSumWeightedQE web factor using random log
PERCENTALE_90 aggregation of LongClickMobileAllWcmWeightedValue web factor using random log
PERCENTALE_25 aggregation of DssmVkPopularity web factor using random log
AVG aggregation of UBLongPeriodVisitsSNProb web factor using random log
PERCENTALE_90 aggregation of CountryQueryRegionality web factor using random log
PERCENTALE_90 aggregation of TRhitw web factor using random log
PERCENTALE_90 aggregation of UBLongPeriodAvgSearchDuration600 web factor using random log
AVG aggregation of RequestIsFromIOS web factor using random log
PERCENTALE_90 aggregation of DssmQueryEmbeddingCtrNoMinerPca4 web factor using random log
AVG aggregation of XfDtShowAllMaxFFieldSetUTBm15FLogW0 web factor using random log
PERCENTALE_25 aggregation of UrlTrigrams web factor using random log
PERCENTALE_90 aggregation of DssmQueryEmbeddingCtrNoMinerPca1 web factor using random log
AVG aggregation of IsRelevLocaleKZ web factor using random log
PERCENTALE_90 aggregation of TextFeatures web factor using random log
1 if the host includes js from marketgid.com
1 if the host includes js from rfity.com
DSSM prediction of google specificity for query
Site owner pays attention to site details (at least once in quarter)
Chat info. positive / events or zero
Host player info. Relation between view time and video duration
1 if the host includes js from google-analytics.com
1 if the host includes js from googleapis.com
1 if the host includes js from facebook.net
1 if the host includes js from mc.yandex.ru
Average value of RandomLogQueryAvgAddTime of the closest knn queries.
Average value of RandomLogQueryAvgTxtHiRelSy of the nearest knn queries.
Average value of RandomLogQueryAvgTextLike of the closest knn queries.
Average value of RandomLogQueryAvgIsForum of the nearest knn queries.
Average value of RandomLogQueryAvgHasPayments of the nearest knn queries.
Average value of RandomLogQueryAvgDifferentInternalLinks of the closest knn queries.
Average value of RandomLogQueryAvgIsTargetBussinessCard of the nearest knn queries.
Average value of RandomLogQueryAvgQueryToDocAllSumFCountTextBm11Norm16384 of the nearest knn queries.
Average value of RandomLogQueryAvgXfDtShowAllSumWFSumWBodyMinWindowSize of the nearest knn queries.
Host speed estimation
Is site official
Estimate of quality links from good sites
Weight sum of each non-unique nevasca shingle
Nevasca shingle quantity in last week
Greentraffic share (aka direct visits). Desktop
Greentraffic share (aka direct visits). Mobile
Greentraffic absolute (desktop)
Visits averaged by user
1 if video on page
Stream PCtrNew from yandex video
Stream PCtrNew from yandex video
Stream PCtrNew from yandex video
Stream PCtrNew from yandex video
Stream PCtrNew from yandex video
Stream PCtrNew from yandex video
The document has a turbo page. It depends on the platform
Medical host quality for metric.
Original query with verbs removed. Computed over the document title. Word weight aggregation algorithm: Bm15. The normalization coefficient is 0.1.
Original query with verbs removed. Computed over a composite stream consisting of the tokenized url and the document title. Word weight aggregation algorithm: Bm15FLogW0. The normalization coefficient is 0.0001.
Original query with verbs removed. Computed over the document content. The minimum size of the window that contains all query words, normalized by the number of words in the query.
Original query with verbs removed. Computed over the tokenized url. Word weight aggregation algorithm: Bm15. The normalization coefficient is 0.1.
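The minimum-window factor above can be illustrated with a standard sliding-window search over the document words. This is only a sketch under stated assumptions: tokenization, lemmatization and verb removal are taken as already done, the function names are hypothetical, and the normalization is assumed to be window size divided by the number of distinct query words (the source does not state the direction of the normalization).

```python
from collections import Counter
from typing import Optional, Sequence


def min_window_size(doc_words: Sequence[str], query_words: Sequence[str]) -> Optional[int]:
    """Length of the shortest run of doc_words containing every query word.

    Returns None if some query word never occurs in the document.
    """
    needed = set(query_words)
    if not needed:
        return None
    counts: Counter = Counter()
    covered = 0  # number of distinct query words inside the current window
    best: Optional[int] = None
    left = 0
    for right, word in enumerate(doc_words):
        if word in needed:
            counts[word] += 1
            if counts[word] == 1:
                covered += 1
        # Shrink from the left while the window still covers all query words.
        while covered == len(needed):
            size = right - left + 1
            if best is None or size < best:
                best = size
            left_word = doc_words[left]
            if left_word in needed:
                counts[left_word] -= 1
                if counts[left_word] == 0:
                    covered -= 1
            left += 1
    return best


def normalized_min_window(doc_words: Sequence[str], query_words: Sequence[str]) -> Optional[float]:
    """Window size divided by the number of distinct query words (assumed)."""
    size = min_window_size(doc_words, query_words)
    if size is None:
        return None
    return size / len(set(query_words))
```

Under this assumed normalization, a value of 1.0 means the query words appear adjacent to each other in the document, in any order.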
RMSE aggregation of Long web factor using random log
RMSE aggregation of IsOrg web factor using random log
RMSE aggregation of GskUrlModel web factor using random log
RMSE aggregation of DaterStatsAverageSourceSegment web factor using random log
RMSE aggregation of VisitsFromWiki web factor using random log
RMSE aggregation of XfDtShowBagOfWordsTitleCosineMaxMatch web factor using random log
RMSE aggregation of UBLongPeriodDownloadsProb web factor using random log
RMSE aggregation of MetaAvgIsNotCgi meta factor using random log
RMSE aggregation of MetaRmsSynPercentBadWordPairs meta factor using random log
RMSE aggregation of MetaPosTrigramsProb meta factor using random log
PERCENTALE_90 aggregation of Bocm web factor using random log
PERCENTALE_90 aggregation of SegmentWordPortionFromMainContent web factor using random log
PERCENTALE_90 aggregation of IsMobileBeauty web factor using random log
PERCENTALE_90 aggregation of USLongPeriodUrlWinsProb web factor using random log
PERCENTALE_90 aggregation of DssmBoostingXfWeightKMeans5AvgTop02ScoreQE web factor using random log
PERCENTALE_90 aggregation of DssmBoostingCtrKMeans1Score web factor using random log
PERCENTALE_90 aggregation of SDIsNavMxQueryMax meta factor using random log
PERCENTALE_90 aggregation of MetaWeb764Web1076ProductInvAvg meta factor using random log
PERCENTALE_90 aggregation of MetaWeb1099Web1219ProductInvPos meta factor using random log
PERCENTALE_90 aggregation of MetaMaxDssmMiddleVsShortLongHardNoClicks meta factor using random log
MAX aggregation of NumLinksFromMP web factor using random log
MAX aggregation of NavLinear web factor using random log
MAX aggregation of DaterStatsAverageSourceSegment web factor using random log
MAX aggregation of WeightedSumIsIndexPageIsNavMxQuery web factor using random log
MAX aggregation of QueryToDocAllSumFCountTextBocm11Norm256 web factor using random log. NOTE: QueryToDocAllSumFCountTextBocm11Norm256 has been removed.
MAX aggregation of DssmBigramsQueryDerivativeMax web factor using random log
MAX aggregation of DssmQueryCountryToUrlEstimatedDistance web factor using random log
MAX aggregation of MetaWeb764Web1076ProductInvAvg meta factor using random log
LOGAVG aggregation of TextFeatures web factor using random log
LOGAVG aggregation of DocLen web factor using random log
LOGAVG aggregation of IsHTML web factor using random log
LOGAVG aggregation of HasLevensht1QueryFragment web factor using random log
LOGAVG aggregation of HeadingIdfSumFixed web factor using random log
LOGAVG aggregation of AdvPronounsPortion web factor using random log
LOGAVG aggregation of LongestText web factor using random log
LOGAVG aggregation of CountryHour web factor using random log
LOGAVG aggregation of MetrikaUrlAvgTime web factor using random log
LOGAVG aggregation of WikiLinkCount web factor using random log
LOGAVG aggregation of BrowserUrlDwellTimeRegionFrc web factor using random log
LOGAVG aggregation of WikiInfobox web factor using random log
LOGAVG aggregation of QueryDocTitleRangesMatchingScore web factor using random log
LOGAVG aggregation of IsMobileBeauty web factor using random log
LOGAVG aggregation of QueryToTextAllSumWFSumWBodyMinWindowSize web factor using random log
LOGAVG aggregation of DssmRandomLogQueryAvgDifferentInternalLinks web factor using random log
LOGAVG aggregation of MetaUrlDirectChildrenCnt meta factor using random log
LOGAVG aggregation of MetaWeb1241Web1299ProductInvPos meta factor using random log
LOGAVG aggregation of MetaEpsHashShareNationalLanguage meta factor using random log

Is Https

#1764
The document has the https protocol
The Levenshtein distance between the query and the url of the form youtubecom/watch normalized to the maximum of the length of the query and the url
The length of the longest common substring between the url and the query normalized to the query length
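The two URL-to-query similarity factors above can be sketched with textbook dynamic programming. The code below assumes both inputs are plain strings that have already been normalized (for example, a URL reduced to the `youtubecom/watch` form). The function names are hypothetical and the handling of empty inputs is an assumption.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance with unit costs for insert, delete, substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost))
        prev = cur
    return prev[-1]


def normalized_levenshtein(query: str, url: str) -> float:
    """Edit distance divided by the longer of the two lengths."""
    longest = max(len(query), len(url))
    if longest == 0:  # assumption: two empty strings are at distance 0
        return 0.0
    return levenshtein(query, url) / longest


def longest_common_substring_len(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            cur.append(prev[j - 1] + 1 if ca == cb else 0)
            best = max(best, cur[j])
        prev = cur
    return best


def common_substring_fraction(query: str, url: str) -> float:
    """Longest common substring length divided by the query length."""
    if not query:  # assumption: empty query yields 0.0
        return 0.0
    return longest_common_substring_len(url, query) / len(query)
```

Both values lie in [0, 1]: the first is 0 for identical strings, the second is 1 when the whole query occurs inside the url.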
The sigmoid normalized value of the porn text query classifier as estimated from Toloka
Binarized value of the porn text query classifier by estimates from Toloka
The [0,1] value of the porn text query classifier as estimated by the web classifier and additional dictionaries
Value of the porn text query classifier, estimated by the web classifier and additional dictionaries, binarized using fxlists
The presence of foul language in the query. 0 - absent, 0.5 - not hard, 1 - hard
Presence of porn markers in the query (0 - yes, 1/3 - no, 1 - query 'gray')
Document pornography classifier, features based on document text
Document pornography classifier, features based on document url
Document pornography classifier, features based on document images (information is taken from the Picture Index)
Document pornography classifier, features based on the document's video (information is taken from the Video index)
Host pornography classifier, features based on pornography queries that were shown and clicked on for the host
The presence of the word official in a lemmatized query
Presence of the word wikipedia in a lemmatized query
The presence in a lemmatized query of the word not and similar in meaning
The presence in the lemmatized query of the words buy, price and similar in meaning
Host return factor. Percentale aggregation with 0.25f of the DwellTimeSumFraction feature
The document came from QuickMed
Host return factor. Percentale aggregation with 0.99f of the AverageReturnTime feature
Host return factor. Percentale aggregation with 0.97f of the AverageReturnTime feature
Host return factor. GreaterFraction aggregation with 0.99f of the AverageReturnTime feature
Host return factor. Percentale aggregation with 0.99f of the AverageLogReturnTime feature
Host return factor. GreaterFraction aggregation with 0.9f of the AverageLogReturnTime feature
Host return factor. LessFraction aggregation with 0.05f of the FirstClickDwellTime feature
Host return factor. WeightedAverage aggregation of the AverageVisitsPer3Hours feature
Medical host quality.
The document has a turbo page for the desktop platform. Updates on top of the base are delivered via saas.
Host return factor. WeightedAverage aggregation of the AverageDwellTimePerHour feature
Host return factor. LessFraction aggregation with 0.1f of the AverageDwellTimePer3Hours feature
Host return factor. Max aggregation of the AverageDwellTimePerWeek feature
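The host return factors above name several aggregation types (Percentale, GreaterFraction, LessFraction) without defining them. The sketch below shows one plausible reading, stated purely as an assumption: the percentile parameter is a quantile level, and the GreaterFraction and LessFraction parameters are thresholds against which per-visit feature values are compared. All function names are hypothetical.

```python
import math
from typing import Sequence


def percentile(values: Sequence[float], level: float) -> float:
    """Nearest-rank quantile: smallest value with at least `level` of data at or below it."""
    ordered = sorted(values)
    index = max(0, math.ceil(level * len(ordered)) - 1)
    return ordered[index]


def greater_fraction(values: Sequence[float], threshold: float) -> float:
    """Share of values strictly greater than the threshold (assumed meaning)."""
    return sum(1 for v in values if v > threshold) / len(values)


def less_fraction(values: Sequence[float], threshold: float) -> float:
    """Share of values strictly less than the threshold (assumed meaning)."""
    return sum(1 for v in values if v < threshold) / len(values)
```

Other quantile definitions (for example, with linear interpolation) would give slightly different values on small samples. Nearest-rank is used here only because it is the simplest to verify by hand.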
Median dwelltime of the query over the entire history. The dwelltime is truncated to 6000. The query is normalized by doppelgangers
Number of query impressions with more than one click over the entire history. The query is normalized by doppelgangers
Share of query impressions with more than one click among all impressions over the entire history. The query is normalized by doppelgangers
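The three query-history statistics above reduce to simple computations once the per-impression data for a normalized query is available. A minimal sketch follows, assuming a list of dwelltimes and a list of click counts per impression as inputs. These inputs and the function names are hypothetical, and the unit of the 6000 cap is not stated in the source.

```python
from statistics import median
from typing import Sequence, Tuple


def median_truncated_dwelltime(dwelltimes: Sequence[float], cap: float = 6000.0) -> float:
    """Median dwelltime after clipping every value at `cap`."""
    return median(min(d, cap) for d in dwelltimes)


def multi_click_stats(clicks_per_impression: Sequence[int]) -> Tuple[int, float]:
    """Count and share of impressions that received more than one click."""
    count = sum(1 for c in clicks_per_impression if c > 1)
    total = len(clicks_per_impression)
    share = count / total if total else 0.0  # assumption: 0.0 when there is no history
    return count, share
```

Clipping before taking the median only matters when more than half of the dwelltimes exceed the cap, in which case the result is the cap itself.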
Owner aggregation of RandomLogWordMaxMetaNumUrlsPerHostFixed web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of MetaWeb1099Web1219ProductInvPos meta factor using random log, aggregation type is LOGAVG
Owner aggregation of DssmDwelltimeRegChainTrainedEmbedding meta factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of DssmRandomLogQueryAvgHasPayments web factor using random log, aggregation type is LOGAVG
Owner aggregation of UBLongPeriodBrowseFrc web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of MetaUrlChildrenCnt meta factor using random log, aggregation type is LOGAVG
Owner aggregation of MetaRmsDifferentInternalLinks meta factor using random log, aggregation type is PERCENTALE_25
Owner aggregation of RandomLogWordMaxHasNoTr web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of MetaResidUSLongPeriodUrlWinsProb meta factor using random log, aggregation type is RMSE
Owner aggregation of PornoQuery web factor using random log, aggregation type is LOGAVG
Owner aggregation of NationalLanguage web factor using random log, aggregation type is LOGAVG
Owner aggregation of PercentVisibleContent web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of MetaWeb1241Web1299ProductInvPos meta factor using random log, aggregation type is PERCENTALE_25
Owner aggregation of LinkAnnFloatMultiplicityAttenV1Bm15K001 web factor using random log, aggregation type is LOGAVG
Owner aggregation of UBLongPeriodLeavesCnt web factor using random log, aggregation type is RMSE
Owner aggregation of NumLinksFromMP web factor using random log, aggregation type is LOGAVG
Owner aggregation of DssmRandomLogQueryAvgDifferentInternalLinks web factor using random log, aggregation type is PERCENTALE_25
Owner aggregation of IsOrg web factor using random log, aggregation type is RMSE
Owner aggregation of QSegmentsBM25 web factor using random log, aggregation type is MAX
Owner aggregation of SegmentAuxAlphasInText web factor using random log, aggregation type is RMSE
Owner aggregation of RandomLogQueryDwelltimeWeightedAvgUrlDomainFraction web factor using random log, aggregation type is LOGAVG
Owner aggregation of RandomLogWordSkipStopWordsMaxIsDesktopRequest web factor using random log, aggregation type is LOGAVG
Owner aggregation of VisitsFromWiki web factor using random log, aggregation type is RMSE
Owner aggregation of IsText web factor using random log, aggregation type is RMSE
Owner aggregation of DBMSubstantive web factor using random log, aggregation type is MAX
Owner aggregation of DaterStatsAverageSourceSegment web factor using random log, aggregation type is RMSE
Owner aggregation of IsMobileBeauty web factor using random log, aggregation type is LOGAVG
Owner aggregation of LongClickSPMixMatchWeightedValue web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of FemAndMasNounsPortion web factor using random log, aggregation type is LOGAVG
Owner aggregation of TrigramsProb web factor using random log, aggregation type is PERCENTALE_90
Owner aggregation of DaterStatsYearNormLikelihood web factor using random log, aggregation type is PERCENTALE_25
Owner aggregation of UrlPathAndParamsFraction web factor using random log, aggregation type is MAX
The average value for the query factor according to QueryToText linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
The average value for the query factor according to QueryToTextByXfDtShowKnn linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
sum / (sum + 10) for the query factor according to XfDtShow linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
Quantile 0.1 for the query factor according to XfDtShow linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
Quantile 0.1 for the query factor according to XfDtShowKnn linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
Quantile 0.9 for the query factor according to XfDtShowKnn linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
sum / (sum + 10) for the query factor according to Qfuf linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
The average value for the query factor according to Qfuf linguistic boosting, calculated in the LingBoostQueryFeatures behemoth rule
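The `sum / (sum + 10)` transform used in the entries above maps an unbounded non-negative sum into the range [0, 1). A minimal sketch, together with the plain average, is given below. The function names are hypothetical, and the input is assumed to be a list of non-negative per-extension values.

```python
from typing import Sequence


def saturated_sum(values: Sequence[float], k: float = 10.0) -> float:
    """sum / (sum + k): grows with the sum but never reaches 1."""
    total = sum(values)
    return total / (total + k)


def average(values: Sequence[float]) -> float:
    """Arithmetic mean, 0.0 for an empty list (assumed fallback)."""
    return sum(values) / len(values) if values else 0.0
```

With k = 10, a sum of 10 maps to exactly 0.5, so the constant sets the point at which the factor is half saturated.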
The site is located in the Tas-IX network (relevant to Uzbekistan)
Dssm Boosting Score for SerpSimilarityHard model over 1-means centroids.
Page quality aggregated by host (avg).
relev_local == uz
25% quantile of the time between the previous query and the current query. The query is normalized by doppelgangers
The result of applying a neural model trained to distinguish long clicks from other events. The model's inputs are word and bigram counters computed from text streams (Title, Body, Url).
Is this host adapted for mobile devices
Linguistic Boosting Factor. Extension type: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation over all extensions. Highest factor value. Weighted stream aggregation of Url, Title, Body, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word weights aggregation algorithm: Bm15FLog (Bm15 aggregation of word occurrence logarithms). The normalization coefficient is 0.001.
Linguistic Boosting Factor. Extension type: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation over all extensions. Highest factor value. Weighted aggregation of the Title, Body, LongClick, LongClickSP, OneClick streams. Word weight aggregation algorithm: BclmWeightedFLogW0. The normalization coefficient is 0.001.
Linguistic Boosting Factor. Extension type: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation over all extensions. Highest factor value. Computed over a composite stream consisting of the tokenized url and the document title. Word weight aggregation algorithm: Bm15FLogW0. The normalization coefficient is 0.0001.
Linguistic Boosting Factor. Extension type: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation over all extensions. Highest factor value. Computed over the document title. Word weight aggregation algorithm: Bm15. The normalization coefficient is 0.1.
The linguistic boosting factor. Type of extensions: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation by top-10 (by factor value) extensions. Weighted sum of factor weights. Normalized by total weight of extensions. Weighted stream aggregation of Url, Title, Body, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word weights aggregation algorithm: Bm15FLog (Bm15 aggregation of word occurrence logarithms). The normalization coefficient is 0.001.
The linguistic boosting factor. Type of extensions: QfufFilteredByXfOneSe (qfuf, filtered by dssm-model XfOneSe). Aggregation by top-10 (by factor value) extensions. Weighted sum of factor weights. Normalized by total weight of extensions. Calculated by document content. Minimum window size that includes all query words. Normalized by the number of words in the query.
Factor on filtered original query: dssm-distance from query without words to original query is calculated, then cutoff by threshold. Weighted stream aggregation Url,Title,Body,Links,CorrectedCtr,LongClick,OneClick,BrowserPageRank,SplitDwellTime,SamplePeriodDayFrc,SimpleClick,YabarVisits,YabarTime. Word weights aggregation algorithm: Bm15FLog (Bm15 aggregation of word occurrence logarithms). The normalization coefficient is 0.001.
Factor on filtered original query: the dssm-distance from the query without words to the original query is calculated, followed by a threshold cutoff. Computed over a composite stream consisting of the tokenized url and the document title. Word weight aggregation algorithm: Bm15FLogW0. The normalization coefficient is 0.0001.
DSSM model trained on cross language CTRs using serp similarity hard miner.
For all words of the query weights are calculated by the query-mutation method (distance between queries in the presence and absence of a word). We take the sum of weights of words found in the title, divided by the sum of weights of all words.
For all query words, the weight is calculated using the query-mutation method (the distance between queries if a word is present or absent). The maximum weight among the words missing in the document title is taken.
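The two query-mutation factors above differ only in how the per-word weights are combined against the document title. The sketch below takes the weights as given, since computing them requires the dssm model that measures the distance between the query with and without a word. The dictionary input and the function names are hypothetical.

```python
from typing import Dict, Set


def title_weight_coverage(word_weights: Dict[str, float], title_words: Set[str]) -> float:
    """Sum of weights of query words found in the title over the sum of all weights."""
    total = sum(word_weights.values())
    if total == 0:  # assumption: no weight mass means zero coverage
        return 0.0
    found = sum(w for word, w in word_weights.items() if word in title_words)
    return found / total


def max_missing_weight(word_weights: Dict[str, float], title_words: Set[str]) -> float:
    """Largest weight among query words absent from the title (0.0 if none are missing)."""
    missing = [w for word, w in word_weights.items() if word not in title_words]
    return max(missing) if missing else 0.0
```

The first value rewards titles that cover the important query words, while the second flags the single most important word that the title lacks.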
The result of applying a neural model trained to distinguish long clicks from other events. The model's inputs are word and bigram counters computed from text streams (Body, Url).
Calculated as (80-x) where x is the document's age in hours (continuous). Uses data from the RobotAddTime dater
Calculated as (10-x) where x is the document's age in days (continuous). Uses data from the RobotAddTime dater
The difference between the current date and the date of the document defined by RobotAddTime, 1 - the date is equal to the current date, 0 - the document is 10 days or more, or the date is not defined
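The three RobotAddTime freshness factors above can be sketched as follows. The first two follow the stated formulas directly. For the third, the source fixes only the endpoints (1 for the current date, 0 at 10 days or more or when the date is undefined), so the linear decay in between is an assumption, as are the function names.

```python
from typing import Optional


def freshness_hours(age_hours: float) -> float:
    """(80 - x), where x is the document age in hours."""
    return 80.0 - age_hours


def freshness_days(age_days: float) -> float:
    """(10 - x), where x is the document age in days."""
    return 10.0 - age_days


def normalized_freshness(age_days: Optional[float]) -> float:
    """1 for a document dated today, 0 at 10 days or more or with no date.

    The linear interpolation between the endpoints is assumed, not stated.
    """
    if age_days is None:
        return 0.0
    return min(1.0, max(0.0, 1.0 - age_days / 10.0))
```

Whether the first two values are clamped at zero for older documents is not stated in the source, so the sketch leaves them unclamped.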
Linguistic Boosting Factor. Extension type: XfOneSeKnn (closest by DSSM model, trained to predict XfDtShow extensions). Aggregation over all extensions. Highest weighted factor value. Normalized to the maximum weight of the extension. Weighted aggregation of the streams Url, Title, Body, Links, CorrectedCtr, LongClick, OneClick, BrowserPageRank, SplitDwellTime, SamplePeriodDayFrc, SimpleClick, YabarVisits, YabarTime. Word weights aggregation algorithm: Bm15FLog (Bm15 aggregation of word occurrence logarithms). The normalization coefficient is 0.001.
Linguistic Boosting Factor. Extension type: XfOneSeKnn (closest by DSSM model, trained to predict XfDtShow extensions). Aggregation over all extensions. Highest weighted factor value. Normalized to the maximum weight of the extension. TODO Algorithm: Maximum weight of the fully matched query annotation. Calculated over the OneClick stream.
Linguistic Boosting Factor. Extension type: QueryToTextByXfOneSeKnn (QueryToText extensions XfOneSeKnn). Aggregation over the top-10 extensions (by factor value). Weighted sum of factor weights. Normalized by total weight of extensions. Calculated over the document content. Minimum window size that includes all query words. Normalized by the number of words in the query.
Linguistic Boosting Factor. Extension type: QueryToTextByXfOneSeKnn (QueryToText extensions XfOneSeKnn). Aggregation over all extensions. Weighted sum of factor weights. Normalized by total weight of extensions. Weighted aggregation of the Title, Body, LongClick, LongClickSP, OneClick streams. Word weights aggregation algorithm: BclmWeightedFLogW0. The normalization coefficient is 0.001.
Domain in the international zone
The query was recognized as showing interest in copyrighted works protected by the Anti-Piracy Memorandum.
The host contains pirated videos protected by the Anti-Piracy Memorandum.
The host contains videos protected by the Anti-Piracy Memorandum.
Average host freshness over 30 days
Proportion of documents with positive freshness surplus from the host in 30 days
Stevenson
Stevenson
Renormalized ethos prediction of a classifier trained on video relevance markup.
Renormalized ethos prediction of a classifier trained on the synthetic sample 'query is typical for a pirate site' vs. 'query is typical for a site far from piracy'.
There has never been a non-zero feature in this slot.
Regression on DSSM embeddings to separate memorandum and non-memorandum queries.
Renormalized ethos prediction of a classifier trained to distinguish memorandum queries from random ones.
Regression on DSSM embeddings to separate pirate-specific and non-pirate-specific queries.
DSSM model that predicts the logarithm of the longest click on the SERP. Negative examples are URLs from past queries of the same user, with no more than 7 minutes between queries (super-hard negatives from reformulations).
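The negative-example selection described above might look roughly like the sketch below. The record layout (`user`, `time`, `urls`, `clicked_url`) and the function name are hypothetical. The sketch also assumes that the 7-minute limit is measured between the current query and each past query, and that the current positive URL is never reused as a negative; the description confirms neither detail.

```python
from datetime import datetime, timedelta


def hard_negatives(current, history, max_gap=timedelta(minutes=7)):
    """Collect candidate negative URLs for `current` from the same user's
    earlier queries issued at most `max_gap` before it."""
    negatives = []
    for past in history:
        same_user = past["user"] == current["user"]
        gap = current["time"] - past["time"]
        # Keep only strictly earlier queries inside the time limit.
        if same_user and timedelta(0) < gap <= max_gap:
            for url in past["urls"]:
                # Assumption: skip the positive URL and avoid duplicates.
                if url != current["clicked_url"] and url not in negatives:
                    negatives.append(url)
    return negatives
```

Because the negatives come from the user's own recent reformulations, they tend to be topically close to the positive example, which is what makes them hard.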
The document came from Quick but not from QuickRt
The document came from QuickRt
The document came from Callisto
Feature LegalPlayers from VideoIndex
Feature SocialNetworksPlayers from VideoIndex
Feature StevensonPlayers from VideoIndex
DSSM model with early binding, trained on reformulations, that predicts the logarithm of the longest click on the SERP.
Rating of news agency from agencies.json > 0 (Yandex.News resource)
Weekday query probability
Indicator of site quality based on user behavior factors, aggregated by owner.
Neural network value for the contexts of query hits in the document text. Predicts relevance-all-8-years. Uses the predictions of formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1 for training.
Antispam bans from erf
DSSM model trained on the reformulation pool, which in the query part besides the query itself receives 4 XfDt extensions with the highest weight
LogAvg statistics of the IsMobileRequest factor, aggregated over the closest URLs on the host
LogAvg statistics of the NanobtaniumQueryWordTitle5nDist2maxXMax factor, aggregated over the closest URLs on the host
Antispam bans on gsm from erf
Antispam bans on fresh from erf
The average IsBlog by query for the year. Calculated offline.
The document has a turbo page for the mobile platform. Updates on top of the base are delivered via saas.
The document has a turbo page for the desktop platform. Updates on top of the base are delivered via saas.
Model trained on the prediction estimates of formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1.
The 'random' factor for commercial sites.
Neural document model for detecting unexpected shock content (for experiments)
Features calculated on the URL with query multitoken expansion
Features calculated on the URL with query multitoken expansion
Model trained on the prediction estimates of formula ussr-dump-20190719 prs-20190720 all-8-years [t > 0.25] CrossEntropy 20k 0.25 -S 0.8 -Z 1 and pre-trained on relevance estimates.
The share of queries that showed the owner's main page among all queries that showed the owner in the last week.
Percentage of visits to the document from the SERP that are at 0 hops. Over 30 days.
The owner's average position across queries for the last week.
The ratio of mobile to desktop search engine traffic.
Mobile-to-desktop ratio for all outbound traffic.
The average value of the query factor isorg for queries with the given owner for the last week.
The average ratio of punctuation to all separators in the owner's documents.
The value of the freshness detector calculated in Begemot. Always 0 when the detector value is below the threshold.
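The thresholding rule in the description above amounts to the following sketch. The function name is hypothetical and the threshold value is not given in the description, so it is left as a parameter.

```python
def thresholded_detector(value, threshold):
    """Pass the detector value through unchanged, or return 0.0 when it is
    strictly below the threshold."""
    return value if value >= threshold else 0.0
```

A value exactly equal to the threshold is passed through, since the description only zeroes values that are less than it.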
The host contains videos protected by the Anti-Piracy Memorandum.
Stevenson
legal privacy