Author: garykac.bsky.social (did:plc:u5jkht56qtj56xaesfxwwtlj)

Record

uri:
"at://did:plc:u5jkht56qtj56xaesfxwwtlj/app.bsky.feed.post/3l4jofdqox32f"
cid:
"bafyreieixgvuzd2c4kpxd4zsridfxof22namaf3ttg3y23ecyzexiwfy44"
value:
text:
"Wordfreq project (python library to get word frequency counts for various natural languages) is no longer being updated because Generative AI has polluted the data sets. It will maintain a snapshot up to 2021 since data after that cannot be verified/trusted.

github.com/rspeer/wordf..."
$type:
"app.bsky.feed.post"
embed:
$type:
"app.bsky.embed.images"
images:
  • alt:
    "Why wordfreq will not be updated
    
    The wordfreq data is a snapshot of language that could be found in various online sources up through 2021. There are several reasons why it will not be updated anymore.
    
    Generative AI has polluted the data"
    image:
    $type:
    "blob"
    ref:
    $link:
    "bafkreiheidxxar45oxubi5zpgphyyqpbmmycx7px4xh2zhysisisgzmefy"
    mimeType:
    "image/jpeg"
    size:
    200208
    aspectRatio:
    width:
    1528
    height:
    536
langs:
  • "en"
facets:
createdAt:
"2024-09-19T18:28:25.715Z"