Awesome
What is MARC::MIR
The current repo comes with 2 things that must be clearly seperated at some point:
- A specification of in memory, acmeic representation of MARC records.
- A Perl implementation of this spec. with
- a very fast ISO2709 serialization/deserialization module
- very nice set of helpers that make MIR manipulations very easy.
see lib/MARC/MIR/Tutorial.pod for futher informations.
MARC::MIR out of the Perl world
-
MARC::MIR python port was written during pycon.fr 2013. I also seen a proto of the MARC::MIR::Template port.
-
I wrote (and have to release) a port of MARC::MIR in ... java :) because it was an excuse to learn about NIO2 and it would give us opportunity to use MIR from clojure and Perl6. As i have no usage of MARC::MIR anymore, motivation is hard to find.
Quo vadis, MARC::MIR ?
First steps of this project came fast and promising before it was discontinued about 2 years ago (last CPAN release in 2013 january).
-
All those code were preliminaries of a big project to merge 4 ILS at Strasbourg University. This project was canceled as the university decided to use commercial solutions to achieve this work.
-
Since then, i occasionally work on MARC records and MARC::MIR is usable enough.
-
I didn't have the occasion to work directly with librarians to setup a complex use case and ensure the whole toolchain can be used without the help of a programmer
-
As i really expect MARC (as well as some librarians) to die, i had no interest on working on this project. even on spare time.
So if someone want to take over my modules, please feel free. Also, there are 2 ways to increase my own motivation:
- invite me to a workshop with some librarians to work on usecases (the mechanics is in good enough shape to come to the battlefield)
- donate or hire me as contractor (the whole todo list, including documentations and test suites, is about 1 full-time month, some tasks are just few hours).
Todo
- make a clear separation between specs and implementation
- complete the reference documentation
- write/improve tutorials
- complete test suites
- it SHOULD be done an acmeic way
- it MUST be battle tested in the case of records written from scratch (as it's the case i almost never had to handle)
- finish canceled projects
Canceled projects
normalize, validate and query MARC records
As a MIR is just an array of array, very simple validations can be written using any existing tool which can query/validate a datastructure. I gave Kwalify a try, you can find the [eg/validate](documented source).
Although, some other pages deserve further reads and tests:
- rx
- json-schema
- (perl implementation)
- quick ref from npm
- with interesting disgressions like metawidget
- jsonary
- JSON validation combinators
But i guess this method would consume a lot of ressources to validate or query
a large set. Plus, rules can be really boring to write if you come with some
cases rarely seen out of the MARC world. What it you want to find a title
matching "rome", "romulus" or "remus" (assuming title
can be 200$a
, 200$b
or 200$c
).
For those range of problems, i use MARC::MIR helpers directly from Perl
use MARC::MIR;
use Perlude;
sub keep_roman_books {
print if
grep $_
, map_values { /rome|romulus|remus/ }
[ 200, [qw( a b c )] ]
, from_iso2709
}
Also i can't expect librarians to do so. that's why we need more tools whom
- MUST be acmeic (usable from any technology or programming langage)
- MUST be GUI driven (it would be possible to write rules from a web interface)
- MAY come with a DSL: intensive users may probably go crazy with web interfaces.
S-expr would be perfect. not only because it will be very easy to implement and port but also because it would be a very pleasant to read and edit
(any 200$[abc] (~ "rome|romulus|remus"))
Both GUI and DSL would store a query as a datastructure that can be used by validation tools. The first to come idea would be (written in YAML).
any:
- field: 200
- subfield: [a, b, c]
- value: [match, "rome|romulus|remus" ]
I made a lightning talk of a prototype during the French Perl Workshop 2013 and a nicer version i now use daily for other purposes is in my github.
ISO-5426 from/to UTF-8
I wrote a ICU Unicode Charmap and tested it using Perl encoding system. It was very fast but we had to deal with some unexpected symbols so i didn't release the whole thing.
Now it's pretty clear unexpected symbols come from our ILS extension of the charmap. We came to the idea to write a 'ISO-5426-loose-unistra' table but we never took time for it.
MARC::MIR::Template
MARC::MIR::Template works very well if you don't care about field indicators. I never had to deal with those indicators so "it worked for me".
Also, there is a known bug but i documented the work around and never got time to investigate on it. I wrote a test in a suite that can be uncommented in the case someone fix it.