Martin A. Brown
7bb5e35aeb
add detail method to source and output documents
...
the detail method produces an output listing about the
document and will soon be more verbose
2016-02-23 08:44:16 -08:00
Martin A. Brown
25add163cb
rely on LDPDocumentCollection
2016-02-22 20:02:12 -08:00
Martin A. Brown
cfd54d4524
pep8 fixes
2016-02-22 13:04:26 -08:00
Martin A. Brown
19f01a0a4c
begin refactoring of directory document handling
2016-02-19 00:55:53 -08:00
Martin A. Brown
d7fac8d65f
move logic from SourceCollection to scansourcedirs
...
moving the source dir scanning logic into a function (in preparation for
further refactoring of single-file or entire-directory source document
detection)
adapting tests (by changing the name from SourceCollection to scansourcedirs).
no other tests required
added new test to ensure that an empty SourceCollection() returned as expected
2016-02-18 23:31:18 -08:00
Martin A. Brown
7d3843c535
adding a bunch of docstring docs
2016-02-18 13:58:53 -08:00
Martin A. Brown
809ddc545b
adding # -*- coding: utf8 -*-
2016-02-18 13:25:02 -08:00
Martin A. Brown
605b57a1ea
sorted(), so dirs and docs are processed stably
...
use sorted() on the sourcedirs and the contents of each directory so that the
directories are always handled in order and the documents are also handled in
order
adjust logging also to refer to "Source collection dir" rather than just "dir"
2016-02-18 09:17:25 -08:00
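The stable ordering described in the commit above can be sketched as follows. This is a minimal illustration, not the project's code: the function name `scan_sorted` and the directory names are assumptions, and Python 3 is assumed for `tempfile.TemporaryDirectory`.

```python
import os
import tempfile


def scan_sorted(sourcedirs):
    """Yield (directory, name) pairs in a repeatable order (hypothetical helper)."""
    # sorted() fixes the order in which the source directories are visited ...
    for sdir in sorted(sourcedirs):
        # ... and the order of each directory's contents, because
        # os.listdir() returns entries in arbitrary order.
        for name in sorted(os.listdir(sdir)):
            yield sdir, name


# Small demonstration using a throwaway directory (names are made up).
with tempfile.TemporaryDirectory() as top:
    for name in ("Zebra-HOWTO", "Apple-HOWTO", "Mango-HOWTO"):
        os.mkdir(os.path.join(top, name))
    names = [name for _, name in scan_sorted([top])]
```

With both levels sorted, repeated runs over the same tree produce the same sequence, which keeps log output comparable between runs.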
Martin A. Brown
2550047d23
pep8/pyflakes corrections
2016-02-17 19:38:27 -08:00
Martin A. Brown
f5a65cf843
put stem in logging like many other logging lines
2016-02-17 19:03:37 -08:00
Martin A. Brown
9301a54ab2
switch to using statfiles
2016-02-17 11:19:48 -08:00
Martin A. Brown
c99dbefa92
shorter __repr__ can fit on one line
2016-02-17 08:35:53 -08:00
Martin A. Brown
dab2f1f8b1
adding support for documents to know their status
2016-02-17 00:17:49 -08:00
Martin A. Brown
f17d164b52
allow creation of empty SourceCollection; fixes
...
Allow creation of an empty SourceCollection, which can be handed around in the
driver to allow for higher-level document wrangling
fix bad, always-failing directory check (thank you, testing)
clarify handling of documents living in a directory and the generation of the
fileset
2016-02-16 23:40:09 -08:00
Martin A. Brown
ac44f5d577
refactor detection loop and identify duplicates
...
the nesting was deeper than necessary, so adjusting the detection of files
(and directories) and adding a bit more logging upon duplicate detection
2016-02-16 10:44:24 -08:00
Martin A. Brown
f6f6d4b543
only guess the doctype once
2016-02-16 00:23:58 -08:00
Martin A. Brown
7b08ececf4
renaming OutputDirs more appropriately to OutputCollection
2016-02-15 23:52:08 -08:00
Martin A. Brown
55ef688015
adjust SourceDirs to behave like a dictionary
2016-02-15 21:51:56 -08:00
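One common way to make a container behave like a dictionary is to subclass `MutableMapping`, sketched below. The class name `DictLikeCollection`, the choice of the document stem as key, and the example path are all assumptions for illustration; the commit does not say how the change was made, and `collections.abc` assumes Python 3.

```python
from collections.abc import MutableMapping


class DictLikeCollection(MutableMapping):
    """Hypothetical dict-like container keyed by document stem."""

    def __init__(self):
        self._docs = {}

    # The five methods below are the abstract interface; MutableMapping
    # derives keys(), items(), get(), update(), `in`, etc. from them.
    def __getitem__(self, stem):
        return self._docs[stem]

    def __setitem__(self, stem, doc):
        self._docs[stem] = doc

    def __delitem__(self, stem):
        del self._docs[stem]

    def __iter__(self):
        return iter(self._docs)

    def __len__(self):
        return len(self._docs)


docs = DictLikeCollection()
docs["Frobnitz-HOWTO"] = "/path/to/Frobnitz-HOWTO.sgml"  # made-up entry
```

Callers can then use ordinary mapping idioms such as `"Frobnitz-HOWTO" in docs` or `docs.get(stem)` without knowing how the documents are stored.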
Martin A. Brown
01bee4a269
adjust error-raising invocations (and tests)
2016-02-15 21:32:35 -08:00
Martin A. Brown
ae189e0d83
adjusting some logging and exceptions for verbosity/clarity
2016-02-15 21:15:29 -08:00
Martin A. Brown
7daefd03bd
renaming Sources to SourceDirs
2016-02-15 20:58:44 -08:00
Martin A. Brown
82a8a21d18
check for plain file type, too
2016-02-12 23:59:13 -08:00
Martin A. Brown
fe507461e8
found another typo while testing
2016-02-12 23:49:04 -08:00
Martin A. Brown
7affd10e0c
correct the reference to the renamed guess function
2016-02-12 13:24:21 -08:00
Martin A. Brown
92fdf8bec1
case-insensitive sorting is preferred
2016-02-12 12:53:30 -08:00
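A short sketch of what case-insensitive sorting changes, assuming it is done with a `key` function (the commit does not show the call, and the names are made up):

```python
names = ["apple-HOWTO", "Zebra-HOWTO", "Mango-HOWTO", "banana-HOWTO"]

# Default string ordering compares code points, so every uppercase
# initial sorts ahead of every lowercase one.
default_order = sorted(names)

# Using str.lower as the key compares names without regard to case.
folded_order = sorted(names, key=str.lower)
```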
Martin A. Brown
ecf2bee8a6
removing unused sys; shortening logger lines
2016-02-12 12:42:58 -08:00
Martin A. Brown
fc9da80f6f
nicer visually to process a sorted set of files
2016-02-11 15:31:38 -08:00
Martin A. Brown
becb768929
handling multiple source dirs and renaming SourceDir to simply Sources
2016-02-11 15:16:12 -08:00
Martin A. Brown
627e2ff636
changing to __future__ (consistency across project)
2016-02-11 11:28:38 -08:00
Martin A. Brown
6de9aee212
changing names to sourcedir and outputdir
2016-02-11 09:28:31 -08:00
Martin A. Brown
5adbb9af4c
removing the output elements from sources.py
2016-02-11 09:14:58 -08:00
Martin A. Brown
41bf2ef9c1
decreasing verbosity level on debugging logging
2016-02-11 08:17:04 -08:00
Martin A. Brown
b047b22470
better __repr__ and doctype @property
...
Include a better __repr__ for the SourceDocument object and
make the doctype attribute a @property
2016-02-11 08:12:16 -08:00
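The two changes named in the commit above might look roughly like this. It is a sketch under assumptions: `guess_doctype`, its extension table, the doctype labels, and the `__repr__` format are hypothetical stand-ins, not the project's actual code.

```python
import os


def guess_doctype(filename):
    """Hypothetical stand-in: map a file extension to a doctype label."""
    known = {".sgml": "linuxdoc", ".xml": "docbookxml"}  # illustrative only
    return known.get(os.path.splitext(filename)[1].lower())


class SourceDocument(object):
    def __init__(self, filename):
        self.filename = filename

    def __repr__(self):
        # A compact, one-line form that is easy to read in log output.
        return "<%s:%s>" % (self.__class__.__name__, self.filename)

    @property
    def doctype(self):
        # Exposed as an attribute (doc.doctype) rather than a method call.
        return guess_doctype(self.filename)


doc = SourceDocument("/docs/Frobnitz-HOWTO.sgml")  # made-up path
```

Making `doctype` a property keeps the calling code as plain attribute access while leaving room to change how the value is computed.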
Martin A. Brown
701657e54b
initial commit
2016-02-10 19:22:23 -08:00