Releases: ssc-oscar/python-woc

v0.4.1

30 Nov 07:18

  • Fixes: decode_value hangs when decoding a long string that contains non-utf8 characters.

Full Changelog (2025-11-30)

v0.4.0

05 Nov 03:52

  • python-woc is now fork safe! That means you can do this:
from multiprocessing import Pool
from woc.local import WocMapsLocal

woc = WocMapsLocal()

def worker(idx):
    return woc.show_content("tree", idx)

# Tree hashes to look up; each worker process reuses the `woc` instance inherited via fork.
tree_hashes = [
    "706aa4dedb560358bff21c3120a0b09532d3484d",
    "3ccf6f8320740a1afec68b38b3b9ba46cedef368",
    "e5798457aebae7c84eff7b80b50c3a938cc4cb63",
    "836f04d5b374033b1608269e2f3aaabae263a0db",
    "f54cb5527226aa2096307c08e15c62248b98f763",
    "da65e1401d11a955686b8a49e46b9a457f3febab",
    "a28f1558be9867d35cc1fa17477565c08786cf83",
    "4db2ad30097924cbe5da9c0f2c49350fdc19c3a4",
    "1cf86145b4a9492ebbe0fa640638504946315ca6",
    "29a422c19251aeaeb907175e9b3219a9bed6c616",
    "51968a7a4e67fd2696ffd5ccc041560a4d804f5d",
]

with Pool(8) as pool:
    for r in pool.imap(worker, tree_hashes):
        print(r)
  • When on_bad is "error", python-woc raises a KeyError when a bad key is queried.
>>> from woc.local import WocMapsLocal
>>> woc = WocMapsLocal(on_bad='error')
>>> woc.get_values("c2p","0a36c08880da83a84209efe5aa90ca3f9b1dc453")
KeyError: 'Key 0a36c08880da83a84209efe5aa90ca3f9b1dc453 is marked as bad: tons of fake blobs'

Bad keys are stored in wocprofile.json, and you need to regenerate the profile to reflect this change.

{
  "bads": {
    "p": {
      "bitzhoumy_helloworld": "damaged trees",
      "thachmai_mobiliPlay": "single repo for torvalds alias"
    },
    "c": {
      "3f631f976149d8702d0b1496df7b98f16a9357ed": "2013166 blobs",
      "14bde94da008ac1c65e0c066ee269315e47c0987": "Completed Search Engine with Cosine Similarity and Champion Lists, storing the entire inverted index in terms, with each term having its own pickle file."
    }
  }
}
  • Fixes a bug where RootProject.commits queries the wrong map.
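
The "bads" section of wocprofile.json shown above can also be inspected directly, for example to check why a key is flagged before querying it. The following is a minimal sketch using only the standard library; bad_reason is a hypothetical helper, not part of python-woc, and the profile fragment is trimmed from the example above.

```python
import json

# A trimmed profile fragment copied from the release notes above.
PROFILE_JSON = """
{
  "bads": {
    "p": {
      "bitzhoumy_helloworld": "damaged trees"
    },
    "c": {
      "3f631f976149d8702d0b1496df7b98f16a9357ed": "2013166 blobs"
    }
  }
}
"""


def bad_reason(profile, dtype, key):
    """Return the recorded reason if `key` is marked bad for `dtype`, else None.

    Hypothetical helper for illustration; not part of python-woc.
    """
    return profile.get("bads", {}).get(dtype, {}).get(key)


profile = json.loads(PROFILE_JSON)
print(bad_reason(profile, "c", "3f631f976149d8702d0b1496df7b98f16a9357ed"))
print(bad_reason(profile, "p", "some_unlisted_project"))
```

Chaining .get() with empty-dict defaults means a v1 profile without a "bads" section simply reports no bad keys.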

Full Changelog (2025-11-05)

v0.3.2

01 Oct 04:09

  • Supports Python up to 3.13
  • Fix data type detection for po2pn & b2fa

Full changelog (2025-09-30)

v0.3.1

07 Aug 13:47

Added support for:

  • c2tag
    commit to tag
  • b2cff
    file renames by git mv, blob -> (commit, old_file, new_file)
  • commit/tree.tch
    get_values('tree.tch','X') = show_content('tree','X')

Full changelog (2025-08-07)

v0.2.6

26 Feb 20:42

  • Add support for tag (show_content) and c2tag (get_values)

Full Changelog: v0.2.5...v0.2.6

v0.2.5

21 Dec 02:04

  • Fixes an index error when parsing cs3 large maps whose entries are not aligned to a multiple of 3
  • Encoding falls back to latin-1 when chardet fails
  • Fixes iter_values not terminating when on_large='head'
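
The latin-1 fallback mentioned above works because latin-1 maps every possible byte to a code point, so decoding with it never raises. A minimal sketch of the pattern follows; it tries plain UTF-8 first instead of running encoding detection, which is a simplification and not python-woc's actual implementation, and decode_with_fallback is a hypothetical name.

```python
def decode_with_fallback(raw: bytes) -> str:
    """Decode bytes as UTF-8, falling back to latin-1 if that fails.

    Hypothetical helper for illustration; python-woc's own decoder differs.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # latin-1 maps every byte 0x00-0xFF to a code point, so this cannot fail.
        return raw.decode("latin-1")


print(decode_with_fallback("héllo".encode("utf-8")))  # valid UTF-8
print(decode_with_fallback(b"caf\xe9"))               # invalid UTF-8, decoded as latin-1
```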

Full list of changes (2024-12-21)

v0.2.4

20 Dec 22:33

  • Add iter_values: Iterates over values rather than consuming a list; can be useful when querying large maps.
  • Add all_keys: Iterates over keys in a map.
  • Switch from builtin gzip to rapidgzip for fast, random access to large maps.
  • exclude_large has been removed; use on_large='ignore' instead.
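
The benefit of iterating over values instead of building a list is that a caller can stop early without reading the rest of a large map. The sketch below illustrates this with a stand-in generator; fake_iter_values and head are hypothetical names, and the generator only imitates a lazy iterator such as the one iter_values returns.

```python
from itertools import islice


def fake_iter_values(n):
    """Stand-in for a lazy value iterator; yields n synthetic values one at a time."""
    for i in range(n):
        yield f"value-{i}"


def head(values, limit):
    """Take at most `limit` items without consuming the rest of the iterator."""
    return list(islice(values, limit))


# Only three values are ever produced, even though the source could yield a billion.
first_three = head(fake_iter_values(10**9), 3)
print(first_three)
```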

v0.2.2

12 Dec 22:47
d520c5e

  • Project.save(): Downloads a repository from World of Code. Missing binary blobs are retrieved on demand from GitHub or GitLab.
  • Fixes show_content for blobs on da servers.

v0.2.1

23 Aug 07:24

Add /home/wocprofile.json to paths.

Full Changelog: v0.2.0...v0.2.1

v0.2.0

15 Jul 09:05

WoC is a huge dataset: hundreds of files totaling hundreds of terabytes. To make sure everything is intact after transfer, wocProfile v2 adds an optional digest field for verifying the integrity of each file.

To create a wocProfile with file digests:

python3 -m woc.detect /woc --with-digest > wocprofile.json

To verify the digests:

python3 -m woc.verify --profile wocprofile.json

This version does not break the old profile schema (v1).
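
For files this large, a digest has to be computed in chunks rather than by reading the whole file into memory. The sketch below shows that general technique with hashlib. The hash algorithm (SHA-256), the chunk size, and the names file_digest and matches are assumptions for illustration; the release notes do not say which digest woc.detect and woc.verify use.

```python
import hashlib
import os
import tempfile


def file_digest(path, algorithm="sha256", chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so memory use stays constant.

    The algorithm is an assumption; wocProfile may use a different one.
    """
    h = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def matches(path, expected, algorithm="sha256"):
    """Return True if the file's digest equals the expected hex string."""
    return file_digest(path, algorithm) == expected


# Demo on a throwaway file.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"example payload")
    demo_path = tmp.name
try:
    digest = file_digest(demo_path)
    print(digest, matches(demo_path, digest))
finally:
    os.remove(demo_path)
```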

Full Changelog: v0.1.2...v0.2.0