Python Thesaurus and ThesaurusCfg

Python Thesaurus - "A different way to call a dictionary."

Copyright (c) 2012-2019 Dave Cinege. All rights reserved.

Licensed under the Apache License, Version 2.0.

See the end of this file for further copyright and license information.

Source code: https://git.cinege.com/thesaurus
Telegram Group: https://t.me/PythonThesaurus

The Thesaurus family works with Python 2.6 and later, including Python 3 through 3.8+.

About

Thesaurus is a mapping data type with recursive keypath mapping and attribute aliasing. It is a subclass of dict() and is mostly compatible as a general-use dictionary replacement.
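To make the two ideas concrete, here is a minimal sketch of a dict subclass with attribute aliasing and dotted keypath lookup. It is not the Thesaurus implementation: the class name AttrDict and its internals are hypothetical, and only the set_path() name is borrowed from the examples in this README.

```python
class AttrDict(dict):
    """Hypothetical stand-in used for illustration; not the Thesaurus code."""

    def __getitem__(self, key):
        # A key that exists literally is returned directly.
        if dict.__contains__(self, key):
            return dict.__getitem__(self, key)
        # Otherwise treat a dotted string as a path through nested mappings.
        if isinstance(key, str) and '.' in key:
            node = self
            for part in key.split('.'):
                node = node[part]
            return node
        raise KeyError(key)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails,
        # so real dict methods such as keys() are unaffected.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def set_path(self, path, value):
        # Create intermediate levels as needed, then set the leaf.
        parts = path.split('.')
        node = self
        for part in parts[:-1]:
            node = node.setdefault(part, AttrDict())
        dict.__setitem__(node, parts[-1], value)


t = AttrDict()
t.set_path('a.b.c.d', 'Hello')
as_keys = t['a']['b']['c']['d']                 # nested keys
as_attrs = t.a.b.c.d                            # attribute aliasing
as_path = t['a.b.c.d']                          # dotted keypath
formatted = 'The value: {a.b.c.d}'.format(**t)  # str.format attribute access
```

Because the class is still a dict, it can be passed to anything that expects a plain dictionary; the aliasing only adds alternative ways to read the same data.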

Thesaurus prefers to be imported and called as thes() in the same way you would use dict(). Thesaurus likes to think of itself as a Python data type primitive that should be used alongside dict(), similar to the relationship between list() and tuple(): they overlap heavily but serve different purposes and diverge into incompatibilities. Thesaurus is currently content with not having its own data type tokens. Hopefully so are you.

ThesaurusExtended is a subclass of Thesaurus providing additional usability methods such as recursive key and value searching.
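As a rough illustration of what recursive value searching means, the sketch below walks a nested plain dict and yields the key path of every matching value. The function name find_value_paths is hypothetical and this is not how ThesaurusExtended is implemented; it only shows the general technique.

```python
def find_value_paths(mapping, target, prefix=()):
    """Yield key paths (tuples of keys) whose value equals target."""
    for key, value in mapping.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into nested mappings, carrying the path so far.
            for found in find_value_paths(value, target, path):
                yield found
        elif value == target:
            yield path


data = {'a': {'b': {'c': {'d': 'Hello'}}}, 'greeting': 'Hello', 'n': 1}
paths = sorted(find_value_paths(data, 'Hello'))
dotted = ['.'.join(p) for p in paths]
```

Here each path is a tuple of keys; joining it with dots gives the same dotted form that a keypath prints as.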

ThesaurusCfg is a subclass of ThesaurusExtended providing a nested key configuration file parser and per key data coercion methods.
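The idea of a nested-key parser with per-key coercion can be sketched with the standard library alone. Everything below is an assumption made for illustration: the parse_cfg function, the LINE_RE pattern, and the COERCERS table are hypothetical and do not reflect the ThesaurusCfg parser or its coercion API. In particular, plain int stands in for a coercion method, and nothing here prevents later value changes.

```python
import re

# key, optional "(coercer)", then "= value"
LINE_RE = re.compile(
    r'^(?P<key>[\w.]+)\s*(?:\((?P<coerce>\w+)\))?\s*=\s*(?P<value>.*)$'
)


def str_to_bool(text):
    # Hypothetical truthy spellings; a real parser may accept others.
    return text.strip().lower() in ('yes', 'true', 'on', '1')


# Hypothetical coercion table, keyed by the name written in parentheses.
COERCERS = {'int': int, 'str_to_bool': str_to_bool}


def parse_cfg(text):
    result = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = LINE_RE.match(line)
        if match is None:
            raise ValueError('cannot parse line: %r' % raw)
        value = match.group('value').strip()
        coerce_name = match.group('coerce')
        if coerce_name is not None:
            value = COERCERS[coerce_name](value)
        # Dotted keys become nested dictionaries.
        parts = match.group('key').split('.')
        node = result
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return result


sample = '''
prog.version(int)           = 123
opt.verbose (str_to_bool)   = yes
hi                          = Hello
'''
parsed = parse_cfg(sample)
```

Keeping the coercer name next to the key means the type rule travels with the configuration file rather than living in the calling code.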

Quick Start

$ python thesauruscfg_sample.py

Then review thesauruscfg_sample.py and thesauruscfg_sample.cfg and you will get an idea what Thesaurus and ThesaurusCfg are all about.

Simple Code Examples

# Thesaurus Basics
from thesaurus import thes, Keypath
t = thes()
t.set_path('a.b.c.d', 'Hello')
print(t['a']['b']['c']['d'])        # as nested keys
print(t.a.b.c.d)                    # attribute aliasing
print(t.a.b['c'].d)                 # as both
print('The value: {a.b.c.d}'.format(**t))
print(f'The value: {t.a.b.c.d}')    # py3.6+ f-string. perfection!

kp = Keypath('a.b.c.d')             # a keypath object
print(t[kp])                        # keypath as recursive map

>>> 'a.b.c.d' in t                  # recursive contains
True
>>> kp[:-2] in t                    # keypath slicing! (== 'a.b' in t)
True
>>> print(kp)                       # keypath's str() dotted
a.b.c.d
>>> print(repr(kp))
['a', 'b', 'c', 'd']                # but it's really like a list

# ThesaurusExtended
from thesaurus import thesext
te = thesext()
te.set_path('a.b.c.d', 'Hello')
print(te.a.b.c[0])                  # as numeric index or slice
for kp in te.get_keys('Hello'):     # a list of matching keypaths is returned
    print('Parent key:', kp[-1])

# ThesaurusCfg
from thesauruscfg import thescfg
cfg = thescfg()
s = '''
prog.version(static_int)    = 123
opt.verbose (str_to_bool)   = yes
hi                          = Hello
'''
# coercion methods can intercept parse, dump, and setitem
>>> cfg.parse(s)
{'prog': {'version': 123}, 'opt': {'verbose': True}, 'hi': 'Hello'}
>>> cfg.dump()
'prog.version = 123\nopt.verbose = True\nhi = Hello\n'
>>> cfg.prog.version = '123214'     # static_int coercion method
>>> cfg.prog.version                # won't allow value change
123

import json
>>> print(json.dumps(cfg, indent=4, separators=(',', ': ')))
{
    "prog": {
        "version": 123
    },
    "opt": {
        "verbose": true
    },
    "hi": "Hello"
}

Current State of Code - 2019-11-13

Thesaurus first came about in December 2012. It has gone through a total of maybe 6 complete rewrites. In mid 2019 I think I finally got it right with the implementation of Keypaths and the creation of ThesaurusCfg, something I have wished I had for my own use for at least the last 4 years.

The current version you will find here has had the recent addition of Keypaths and, while it is quite usable, it might still have a few bugs to iron out.

Additionally, parts should be cleaned up and reimplemented now that Keypaths have been implemented. If you are reviewing the code, this will likely stand out. With that said, please remember that at this stage few things have been done by accident. There are certainly sections that could be cleaned up for readability; however, they are this way on purpose, for performance.

ThesaurusCfg is quite new. It works. I have a version of it frozen in production code. But it has not had the maturity of Thesaurus in practice and requires a more thorough rewrite.

I'm putting this code out now to gather feedback before finalizing a proper release. As such, both modules are very much subject to change at this time. Specifically, I have the following decisions to make:

  • Finalize naming conventions. Use set_path() or setpath()?

  • Finalize specialty method names. Are merge(), mesh() and screen() good names that make sense? I am struggling with the ThesaurusExtended search method names myself.

  • Review how I do recursive copies and deepcopies.

  • Decide how to properly handle copying ThesaurusCfg coercion methods.

  • Resolve Thesaurus's schizophrenia:

    t = thes()
    t.set_path('a.b.c', 'Hi')
    v = t['a.b.c']           # This recurses
    t['a.b.c'] = 'Bye'       # This sets a dotted key name
    v = t['a.b.c']           # v == 'Bye', not 'Hi'
    >>> t
    {'a': {'b': {'c': 'Hi'}}, 'a.b.c': 'Bye'}
    

    I want to be comfortable that this feels natural to others, or else find a better way.
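The get/set asymmetry in the last item can be reproduced with a small stand-alone model. The DottedGet class below is hypothetical and is not the Thesaurus code; it assumes only what the example above shows: item lookup falls back to walking a dotted path, item assignment is plain dict behaviour, and a literal key takes priority once it exists.

```python
class DottedGet(dict):
    """Hypothetical minimal model of the behaviour described above."""

    def __getitem__(self, key):
        if dict.__contains__(self, key):      # a literal key always wins
            return dict.__getitem__(self, key)
        if not isinstance(key, str):
            raise KeyError(key)
        node = self
        for part in key.split('.'):           # otherwise walk the dotted path
            node = dict.__getitem__(node, part)
        return node

    # __setitem__ is deliberately left as inherited from dict.


t = DottedGet({'a': {'b': {'c': 'Hi'}}})
before = t['a.b.c']         # no literal 'a.b.c' key yet, so this recurses
t['a.b.c'] = 'Bye'          # creates a literal dotted key at the top level
after = t['a.b.c']          # the literal key now shadows the nested value
nested = t['a']['b']['c']   # the nested value itself is untouched
```

In this model the same expression, t['a.b.c'], reads two different locations depending on whether an assignment has happened in between, which is the surprise the list item describes.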

Early Release Description from 2013

Thesaurus is a pure dictionary subclass which allows calling keys as if they are class attributes and will search through nested objects recursively when __getitem__ is called.

You will notice that the code is very compact. However, I have found that it has completely changed the way I program in Python. I've rewritten some existing programs using Thesaurus and often realized a 15-30% code reduction. Additionally, I find the new code much easier to read.

If you find yourself programming with nested dictionaries often, fighting to generate output or command lines for external programs, or wishing you had a dictionary that could act (sort of) like a class, Thesaurus may be for you.