Advanced schema-aware streaming XML parser (fully typed output for TypeScript!)

塵心凡夢 8年前發布 | 11K 次閱讀 XML TypeScript

來自: https://github.com/charto/cxml

cxml

 

cxml aims to be the most advanced schema-aware streaming XML parser for JavaScript and TypeScript. It fully supports namespaces, derived types and substitution groups. It can handle pretty hairy schema such as GML , WFS and extensions to them defined by INSPIRE . Output is fully typed and structured according to the actual meaning of input data, as defined in the schema.

Introduction

For example this XML:

<dir name="123">
    <owner>me</owner>
    <file name="test" size="123">
        data
    </file>
</dir>

can become this JSON (run npm test to see it happen):

{
  "dir": {
    "name": "123",
    "owner": "me",
    "file": [
      {
        "name": "test",
        "size": 123,
        "content": "data"
      }
    ]
  }
}

Note the following:

  • "123" can be a string or a number depending on the context.
  • The name attribute and owner child element are represented in the same way.
  • A dir has a single owner but can contain many files, so file is an array but owner is not.
  • Output data types are as simple as possible while correctly representing the input.

See theexample schema that makes it happen. Schemas for formats like GML and SVG are nastier, but you don't have to look at them to use them through cxml .

Relevant schema files should be downloaded and compiled usingcxsd before using them to parse documents. Check out the example schema converted to TypeScript .

There's much more. What if we parse an empty dir:

import  as cxml from 'cxml';
import  as example from 'cxml/test/xmlns/dir-example';

var parser = new cxml.Parser();

var result = parser.parse('<dir name="empty"></dir>', example.document);</pre>

Now we can print the result and try some magical features:

result.then((doc: example.document) => {

console.log( JSON.stringify(doc) );  // {"dir":{"name":"empty"}}
var dir = doc.dir;

console.log( dir instanceof example.document.dir.constructor );   // true
console.log( dir instanceof example.document.file.constructor );  // false

console.log( dir instanceof example.DirType );   // true
console.log( dir instanceof example.FileType );  // false

console.log( dir._exists );          // true
console.log( dir.file[0]._exists );  // false (not an error!)

});</pre>

Unseen in the JSON output, every object is an instance of a constructor for the appropriate XSD schema type. Its prototype also contains placeholders for valid children, which means you can refer to a.b.c.d._exists even if a.b doesn't exist. This saves irrelevant checks when only the existence of a deeply nested item is interesting. The magical _exists flag is true in the prototypes and false in the placeholder instances, so it consumes no memory per object.

We can also process data as soon as the parser sees it in the incoming stream:

parser.attach(class DirHandler extends (example.document.dir.constructor) {

/** Fires when the opening <dir> and attributes have been parsed. */

_before() {
    console.log('Before ' + this.name + ': ' + JSON.stringify(this));
}

/** Fires when the closing </dir> and children have been parsed. */

_after() {
    console.log('After  ' + this.name + ': ' + JSON.stringify(this));
}

});</pre>

The best part: your code is fully typed with comments pulled from the schema!

Related projects

  • node-xml4js uses schema information to read XML into nicely structured objects.

License

The MIT License

Copyright (c) 2016 BusFaster Ltd

</article>

 本文由用戶 塵心凡夢 自行上傳分享,僅供網友學習交流。所有權歸原作者,若您的權利被侵害,請聯系管理員。
 轉載本站原創文章,請注明出處,并保留原始鏈接、圖片水印。
 本站是一個以用戶分享為主的開源技術平臺,歡迎各類分享!