Implementering av en COLLADA-parser kompatibel med GWT

Implementation of a COLLADA Parser Compatible with GWT

Jesper Corell 2015

Bachelor of Science Computer Engineering

Luleå University of Technology

asset (3D environment). These files usually don’t have support for rendering in 3D engines so some kind of transformation is needed between file formats.

Agency9 needs a new COLLADA parser because their old one has a few problems (mainly a lack of flexibility) that need to be addressed. The new parser will ultimately transform COLLADA to A3X, a binary file format that Agency9 created themselves and that their 3D engine can render.

This thesis was done at Agency9. The work consisted mainly of programming in Java to implement a parser system, which is based on an existing XML parser.

The results were good and the parser is much more flexible than their current parser. The execution speed is twice as fast as their current parser.

COLLADA documents are usually comprehensive and consist of more than geometry (for example cameras and animation). For this thesis the scope was limited to geometry and materials.

support I have received. I would also like to thank Adam Lärkeryd and Martin Börjesson (author of A3X) who despite their busy schedule could give me some guidance from time to time. The whole team at Agency9 has been very supportive which I appreciate. Finally I’d like to thank my supervisor at Luleå University of Technology, Peter Parnes, for all the support I’ve received.

Contents

Terminology
    General
    3D
List of Figures
1 Introduction
    1.1 Background
    1.2 Problem Definition
    1.3 Demarcation
    1.4 Related Work
    1.5 Thesis Structure
2 Theory and Proficiencies
    2.1 XML
    2.2 GWT
        2.2.1 Compatibility
        2.2.2 JSNI
    2.3 COLLADA
        2.3.1 Examples of Tags
    2.4 A3X
3 Method
    3.1 XML Parser Types
        3.1.1 DOM Parsers
        3.1.2 Event Parsers
    3.2 XML Parser Implementation
        3.2.1 SJSXP
        3.2.2 Woodstox
        3.2.3 Aalto
4 Implementation
    4.1 System Design
    4.2 Aalto
        4.2.1 Synchronous Version
        4.2.2 Asynchronous Version
        4.2.3 Client-side
        4.2.4 Server-side
    4.3 Collada Parser
        4.3.1 Parsing COLLADA Tags
        4.3.2 Main Tags
        4.3.3 Other tags
    4.4 A3X Converter
        4.4.1 Client-side
        4.4.2 Server-side
    4.5 Drag and Drop Functionality
    4.6 The Rubber Duck
    4.7 Multiple Meshes and Materials
5 Results
    5.1 Solution
    5.2 Performance
    5.3 Evaluation
    5.4 Future Work
    5.5 Reflections

Terminology

General

.dae - COLLADA file extension

.fbx - 3D graphics file format created by Autodesk

.obj - 3D graphics file format created by Wavefront Technologies

.xsd - XML Schema file extension

A3X - Agency9’s own file format. Can be used to render 3D models in their 3D engine.

API - Application Programming Interface, set of routines, tools and protocols for building software applications.

DOM - Document Object Model, XML parser API where a tree structure is created in memory which can then be accessed randomly.

GIS - Geographical information system, system designed for spatial and geographical data.

GWT - Google Web Toolkit, creates JavaScript code from Java classes.

HTML - Standard markup language used to create web pages.

JAXB - Java Architecture for XML Binding, maps Java classes to XML representations.

JAR - Java Archive, archive that consists of Java files.

JDK - Java Development Kit, set of resources that can be used in a Java application.

JSNI - JavaScript Native Interface, integrates handwritten JavaScript into Java source code

SAX - Simple API for XML, XML parser API where the parser ’pushes’ data to the application.

StAX - Streaming API for XML, XML parser API where the application ’pulls’ data from the parser.

XML - Extensible Markup Language, defines a set of rules for encoding documents.

XML Schema - Configuration file, describes the elements of an XML document.

3D

Material - Describes lighting and texturing of a mesh.

Matrix - Rectangular array of numbers.

Mesh - Collection of triangles. One 3D model may consist of several meshes.

Normal - Vector perpendicular to a surface (used for lighting).

Position - 3D-coordinate with x-, y- and z-axis.

Renderable file - File that can be rendered in a 3D engine, for example .obj, .fbx, .a3x

Texture Coordinate - 2D-coordinate used for mapping a texture to a 3D model.

Transform - Matrix that is used to scale/rotate/translate a mesh/3D model

Triangle - Consists of 3 vertices which create a flat surface.

Vertex - Point which consists of a position, normal and texture coordinate.

List of Figures

1.1 File chain from client to server.
1.2 New file chain from client to server.
2.1 XML example file
2.2 JSNI example source code
4.1 State diagram.
4.2 Aalto synchronous code example.
4.3 Aalto asynchronous code example.
4.4 Parser class diagram.
4.5 TagParser interface
4.6 Hashmap used in ColladaParser.
4.7 <float_array> parser
4.8 Triangle indexing.
4.9 Pseudocode for converting triangles from COLLADA to A3X.
4.10 A3XConverter class.
4.11 Adds material data to A3X object.
4.12 Drag and drop in JavaScript.
4.13 Rubber duck in A3X.
4.14 JavaScript file created by GWT.
4.15 Stairs in A3X (12 meshes, 8 materials).
5.1 Cylinders in A3X (1700 meshes).
5.2 Performance diagram.

Chapter 1 Introduction

1.1 Background

COLLADA is a format used to exchange digital assets [5] (.dae files - digital asset exchange). With COLLADA you can store an entire scene with cameras, lighting, geometry, animations and much else. The .dae files can’t be used directly to render a scene or a 3D object, so you need some kind of parser that translates the .dae file into some kind of renderable file or structure.

This thesis work was done at Agency9. Agency9 are the creators of 3D Maps [4], a WebGL engine used for creating GIS applications. Their engine supports importing COLLADA files, which are then converted into a renderable file format. They also created CityPlanner, a visualisation solution for urban planning.

Agency9 needs a new COLLADA parser: their current parser has a few issues that need to be addressed, and there are new requirements that need to be met. Their current parser is built with JAXB, an API that maps Java classes to XML representations [16] (XML is a markup language similar to HTML; a COLLADA file is an XML file with a specific set of tags [23]). The data is then converted into the A3X format (A3X is Agency9’s own renderable file format).

The problem with JAXB is that it is a general parser and not tailored for COLLADA files. In order to parse an XML file using JAXB, an XML Schema (XSD) file is needed which describes the elements of the XML file (element and tag will be used interchangeably in this report; they mean the same thing in the context of XML). JAXB can only parse an XML file if the contents of the XSD file match the target XML file. This means that each different type of XML file needs its own XSD file in order for JAXB to work.

These COLLADA files are not created by Agency9 themselves; they are supplied by their clients. Depending on the exporter, these COLLADA files can have small differences which need to be interpreted differently. They may also be exported using an older version of XML. JAXB is not able to handle all these special cases correctly, which is the main problem with their current parser. For this reason Agency9 wants a new parser that is tailored for COLLADA and doesn’t use JAXB. The XSD file required by JAXB is unnecessary since their parser only parses COLLADA files. A parser tailored for COLLADA can handle all these special cases correctly and possibly more efficiently. The new parser will need neither an XSD file nor any other configuration.

Currently Agency9 only has a parser on the server side, so when a client needs to upload a new 3D model they must send the file to the server, which converts it to A3X and sends the result back to the client, who finally gets a visual representation. Figure 1.1 displays the file chain.

Figure 1.1: File chain from client to server.

This is inconvenient and they want the parser to be able to work in the browser as well, without any interaction with their servers.

By creating a parser which can be used on both the server side and the client side, the file no longer has to be sent to the server to be parsed; the client can handle it in the browser. This offloads the server a bit, and the client receives much faster feedback on the visual representation of the 3D model.

If the client is satisfied with the 3D model they can send the A3X file to the server for storage. Figure 1.2 displays the new file chain.

Figure 1.2: New file chain from client to server.

1.2 Problem Definition

The goal is to create an efficient and flexible parser that can be used to parse COLLADA files to A3X on both client and server in Agency9’s pipeline.

Requirements

• Flexibility

It should be able to handle special cases of data. It should be easy to modify the parser behaviour depending on parameters such as the XML version.

• Execution speed

Should be at least as good as their current parser. The requirement is less strict for the client-side parser since the target platform is different.

• Memory consumption

Should not allocate excessive or unnecessary blocks of memory.

• Drag and drop functionality

Should work in the browser without any major issues.

1.3 Demarcation

COLLADA files store a lot of information, such as cameras and animations. I am only interested in the geometry and materials and will therefore not parse cameras, animations or other data.

1.4 Related Work

COLLADA parsers are not uncommon, and some open-source COLLADA parsers already exist. The main issue with these parsers is that they usually only support export to more well-known file types such as .fbx and .obj, and not to .a3x since it’s Agency9’s own file format.

All these COLLADA parsers are based on an existing XML parser. There are plenty of XML parsers to choose from depending on the needs of the application. In section 3.2 these different types of XML parsers will be described in detail.

Jacolo

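Jacolo is an open source COLLADA parser in Java which can parse a subset of the COLLADA tags and export to the .fbx file format [15]. It’s licensed under the GNU GPL, whose copyleft terms require software built on it to be distributed under the same license, so it is not suitable for use in proprietary paid software.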

jaxb-collada

“jaxb-collada” is a COLLADA parser in Java which is created by using JAXB [17]. It only supports export of .obj models.

1.5 Thesis Structure

Section 2 covers the theory and proficiencies that are needed to understand this thesis; tools and technologies are explained. Section 3 explains different XML parser technologies and which XML parser my system is based on. Section 4 describes the details of the implementation. Section 5 presents the final product and its performance, an evaluation of the results, future work and personal reflections.

Chapter 2

Theory and Proficiencies

2.1 XML

XML is a markup language similar to HTML. It is an open standard that is used to describe information in plain text which is both human-readable and machine-readable. A COLLADA file is an XML file with a defined set of tags [23]. Figure 2.1 is an example of an XML file.

    <employee>
        <id>1</id>
        <name>Alba</name>
        <salary>100</salary>
    </employee>

Figure 2.1: XML example file

2.2 GWT

The client application which Agency9 hosts is created using GWT, which compiles Java code and generates JavaScript that can run in the browser [13]. By using GWT you can create a parser in Java which can run on both the client (compiled through GWT) and the server (compiled normally). GWT is somewhat restricted and does not support all the functionality in Java, so the libraries and classes used for the client-side parser need to be compatible with GWT. GWT also requires that all source code is accessible, so if an external library is used the .class files aren’t enough; the .java files are needed.

2.2.1 Compatibility

To compile Java through GWT the code needs to be compatible with a specific Java version depending on the version of the GWT SDK. For example, GWT 2.0 supports generics whereas GWT 1.4 does not. The following is a summary of Java language support in GWT:

• “Primitive types (boolean, byte, char, short, int, long, float, and double), Object, String, arrays, user-defined classes, etc. are all supported, with a couple of caveats.” [11]

• “Exceptions: try, catch, finally and user-defined exceptions are supported as normal, although Throwable.getStackTrace() is not meaningfully supported in production mode. Several fundamental exceptions implicitly produced by the Java VM, most notably NullPointerException, StackOverflowError, and OutOfMemoryError, do not occur in production mode as such. Instead, a JavaScriptException is produced for any implicitly generated exceptions. This is because the nature of the underlying JavaScript exception cannot be reliably mapped onto the appropriate Java exception type.” [11]

• “Multithreading and synchronization: JavaScript interpreters are single-threaded, so while GWT silently accepts the synchronized keyword, it has no real effect. Synchronization-related library methods are not available, including Object.wait(), Object.notify(), and Object.notifyAll(). The compiler will ignore the synchronized keyword but will refuse to compile your code if the Object’s related synchronization methods are invoked.” [11]

2.2.2 JSNI

“Often, you will need to integrate GWT with existing handwritten JavaScript or with a third-party JavaScript library. Occasionally you may need to access low-level browser functionality not exposed by the GWT class API’s. The JavaScript Native Interface (JSNI) feature of GWT can solve both of these problems by allowing you to integrate JavaScript directly into your application’s Java source code.” [12]

Not all classes from the JDK are available in GWT. Sometimes there is an alternative that can be used in JavaScript; you can then use JSNI to access this JavaScript object from Java. Figure 2.2 is an example of how to code a JSNI method that puts up a JavaScript alert dialog.

    public static native void alert(String msg) /*-{
        $wnd.alert(msg);
    }-*/;

Figure 2.2: JSNI example source code

2.3 COLLADA

COLLADA is a well-known file format used for exchanging digital assets. COLLADA defines an open standard XML Schema which can be used to exchange digital assets among different types of software applications which might otherwise store their assets in incompatible file formats. COLLADA files are XML files which can be interpreted by any XML parser. A COLLADA file can contain visual scenes including geometry, shaders, effects, physics, animation and much more [5].

The COLLADA files that need parsing will mainly consist of buildings or whole cities (which themselves consist of several buildings).

2.3.1 Examples of Tags

There are many different tags in a COLLADA file. The following is a subset of the tags that need to be parsed. These tags and their descriptions can be found in the COLLADA specification [6].

“The <geometry> element categorizes the declaration of geometric information. Geometry is a branch of mathematics that deals with the measurement, properties, and relationships of points, lines, angles, surfaces, and solids. The <geometry> element contains a declaration of a mesh, convex mesh, or spline.” [6].

Each COLLADA file may contain several <geometry> tags. Each <geometry> tag contains one and only one <mesh> tag, which means that the <geometry> tag can be treated as a mesh since it only contains an identifier and the <mesh> tag.

<mesh>

”Meshes embody a general form of geometric description that primarily includes vertex and primitive information.

Vertex information is the set of attributes associated with a point on the surface of the mesh. Each vertex includes data for attributes such as:

• Position

• Normal

• Texture coordinate

The mesh also includes a description of how the vertices are organized to form the geometric shape of the mesh. The mesh vertices are collated into geometric primitives such as polygons, triangles, or lines.” [6]

The <mesh> tag will contain all the geometry that represents the mesh.

“The <float_array> element stores the data values for generic use within the COLLADA schema. The arrays themselves are strongly typed but without semantics. They simply describe a sequence of floating point values. The data from this element can be positions, normals, texture coordinates etc.” [6]

The <float_array> tag is nested inside the <mesh> tag (as a child of a <source> element).
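To make the content of a <float_array> concrete, the following is a minimal sketch (not the parser developed in this thesis) that splits the whitespace-separated character data of such an element into a float[]. The class name FloatArrayText and its parse method are hypothetical and only serve the illustration.

```java
import java.util.StringTokenizer;

public class FloatArrayText {
    // Hypothetical helper: converts the character data of a <float_array>
    // element (numbers separated by spaces or line breaks) into a float[].
    public static float[] parse(String text) {
        StringTokenizer tokens = new StringTokenizer(text); // splits on whitespace
        float[] values = new float[tokens.countTokens()];
        for (int i = 0; i < values.length; i++) {
            values[i] = Float.parseFloat(tokens.nextToken());
        }
        return values;
    }

    public static void main(String[] args) {
        // Two positions (x y z) as they might appear inside a <float_array>.
        float[] positions = parse("0.0 1.5 -2.25\n 3 4e1 5");
        System.out.println(positions.length + " values read");
    }
}
```

How the values are grouped (for example three per position, two per texture coordinate) is not stored in the array itself, which matches the specification's remark that the arrays are typed but carry no semantics.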

”The <effect> element contains data for materials. A material is a set of values which can represent for example ambient, emission and specular color (lighting properties). It can also contain a texture used to texture a 3d-object using texture coordinates.” [6]

The <effect> tag is needed to connect materials to meshes.

2.4 A3X

The target format for this parser is the A3X file format, which is a renderable file format created by Agency9. An A3X file consists of a set of meshes, each of which consists of positions, normals, texture coordinates, triangles and materials, among other things. A3X files are binary files, so the data needs to be stored in byte arrays. Since COLLADA is a text format, the data read from COLLADA needs to be converted before being inserted into an A3X object.
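As an illustration of the text-to-binary step, the following sketch packs already parsed float values into a byte array using java.nio.ByteBuffer on the JVM. It does not use Agency9's A3X library, whose API is not described here; the class name FloatPacking and the little-endian byte order are assumptions chosen only for the example.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class FloatPacking {
    // Packs floats into a byte array, four bytes per value.
    // ASSUMPTION: little-endian order is used purely for illustration;
    // the real A3X byte layout is not specified in this text.
    public static byte[] toBytes(float[] values) {
        ByteBuffer buffer = ByteBuffer.allocate(values.length * Float.BYTES)
                .order(ByteOrder.LITTLE_ENDIAN);
        for (float value : values) {
            buffer.putFloat(value);
        }
        return buffer.array();
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new float[] {0.0f, 1.5f, -2.25f});
        System.out.println(bytes.length + " bytes written");
    }
}
```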

Agency9 has created an A3X library in Java which will be used. Since the functionality already exists there is no point in creating a new library; creating a new A3X library could be a thesis by itself.

When the data from COLLADA has been parsed and stored, an A3X object will be created using the A3X library and the converted COLLADA data will be added to it.

Chapter 3 Method

This project will be created from scratch. However it will use an existing XML parser library. In this section different types of XML parsers will be described with their respective advantages and disadvantages. A requirement for these XML parsers is that they are open source since GWT needs the source code in order to compile and generate JavaScript. The license is also important since the parser may be used in a commercial product in a later stage. The requirements specified in section 1.2 will also be considered.

3.1 XML Parser Types

3.1.1 DOM Parsers

DOM defines the way a document can be accessed and manipulated. DOM parsers create a tree structure of a whole XML document and store it in memory. This tree structure can then be navigated and elements can be added, modified, retrieved and deleted [7].

The disadvantage of this document structure is that the whole document needs to be kept in memory, which may not be possible in a browser since the XML file could be arbitrarily large. The application Agency9 hosts can be used on smartphones and tablets, which have more limited memory.

DOM is much slower than StAX and SAX according to various tests [8].

This parser type can’t be applied to the current project due to the memory and speed requirements.
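For comparison with the event parsers described next, here is a minimal sketch of DOM-style access using the DOM parser that ships with the JDK (javax.xml.parsers). The class name DomExample and the helper firstText are hypothetical. The whole document is loaded into a tree first, after which elements can be looked up in any order.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

public class DomExample {
    // Builds the complete in-memory tree, then looks up the first element
    // with the given tag name and returns its text (or null if absent).
    public static String firstText(String xml, String tag) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName(tag);
            return nodes.getLength() == 0 ? null : nodes.item(0).getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee><id>1</id><name>Alba</name><salary>100</salary></employee>";
        // Random access: the order of the lookups does not matter.
        System.out.println(firstText(xml, "salary"));
        System.out.println(firstText(xml, "name"));
    }
}
```

Note that this sketch re-parses the document on every call for brevity; the memory cost discussed above comes from the tree that each parse builds.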

3.1.2 Event Parsers

Event parsers work differently from DOM parsers. DOM operates on the document as a whole, while an event parser operates on each piece of the XML document sequentially. An event parser reports parsing events such as the start of an element, the characters of an element and the end of an element [9]. This gives a smaller memory footprint since no internal tree structure needs to be maintained, and it also allows for faster access times than DOM parsers [8]. A disadvantage of event parsers is that once the document has been parsed the data can’t be accessed again without reading the whole document again; however, this is not a problem when parsing COLLADA files since they only have to be read once.

This type of parser is suitable for this project. There are essentially two types of event parsers: SAX and StAX.

SAX

SAX “pushes” the information read from XML to the application. The application registers callback functions, and once the SAX parser starts it iterates through the whole document and calls them whenever a parsing event occurs. The application therefore has no control over the parser and needs to maintain state variables in order to know where in the document the parser is [18].

A SAX parser could be viable for this project; however, since it’s a push parser, the state management could end up as a bottleneck.
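The need for state variables can be seen in a small sketch that uses the SAX parser bundled with the JDK. The class SaxExample and its nested NameHandler are hypothetical names; the handler collects the text of every <name> element and has to remember by itself whether the parser is currently inside such an element.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxExample {
    static class NameHandler extends DefaultHandler {
        private boolean insideName = false; // state the application must track itself
        final StringBuilder names = new StringBuilder();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes atts) {
            if ("name".equals(qName)) insideName = true;
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            if (insideName) names.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            if ("name".equals(qName)) {
                insideName = false;
                names.append(';'); // separator between collected names
            }
        }
    }

    // The parser drives the callbacks; the application only receives events.
    public static String names(String xml) {
        try {
            NameHandler handler = new NameHandler();
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
            return handler.names.toString();
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(names("<staff><name>Alba</name><id>1</id><name>Bo</name></staff>"));
    }
}
```

With deeper nesting, a single boolean is no longer enough and the handler ends up maintaining a stack or several flags, which is the state management referred to above.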

StAX

StAX works a bit differently from SAX; it is actually the opposite. In StAX the application controls the parser by using a cursor that points to some location in the document. Since the application controls the cursor, it decides when to parse the next piece of the XML document. In other words, the application “pulls” the data from the parser. The application knows where in the document it is and doesn’t need to maintain any state variables [20].

StAX is the preferred API for this project: it is a fast pull parser that doesn’t leave a large memory footprint. No state management is needed since the application controls the cursor that points into the XML document.
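The pull style can be sketched with the StAX implementation included in the JDK (javax.xml.stream), which exposes the same interface that Woodstox and Aalto implement. The class StaxExample and the helper textOf are hypothetical names. The application advances the cursor itself and simply stops pulling once it has found what it needs.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class StaxExample {
    // Advances the cursor until the first start tag with the given name
    // and returns its text content, or null if no such element exists.
    public static String textOf(String xml, String tag) {
        try {
            XMLStreamReader reader = XMLInputFactory.newInstance()
                    .createXMLStreamReader(new StringReader(xml));
            try {
                while (reader.hasNext()) {
                    if (reader.next() == XMLStreamConstants.START_ELEMENT
                            && tag.equals(reader.getLocalName())) {
                        // The application decides to read on: pull the text
                        // up to the matching end tag, then stop parsing.
                        return reader.getElementText();
                    }
                }
                return null;
            } finally {
                reader.close();
            }
        } catch (Exception e) {
            throw new IllegalStateException("could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        String xml = "<employee><id>1</id><name>Alba</name><salary>100</salary></employee>";
        System.out.println(textOf(xml, "name"));
    }
}
```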

3.2 XML Parser Implementation

3.2.1 SJSXP

“JSR 173 defines a new Streaming API for XML (StAX). The Sun Java Streaming XML Parser (SJSXP) is an efficient implementation of the StAX API which is fully compliant with the XML 1.0 and Namespace 1.0 specifications” [19].

This is an XML parser library created by Sun in 2010. However, this library is licensed under GPL v2, which rules it out for this project since software built on it could not later be turned into closed, paid software.

3.2.2 Woodstox

“Woodstox is a full-featured high-performance Open Source Java XML processor. It implements Streaming XML API, Stax (JSR-173, javax.xml.stream), and is available under 2 Open Source licenses (Apache License, LGPL)” [21].

Woodstox is definitely a candidate to be used as the XML parser library. There are several benchmarking tests comparing Woodstox to different libraries, and Woodstox outperforms SJSXP in several of them [22]. Since it’s licensed under LGPL/Apache License it can be used in paid software.

3.2.3 Aalto

”The Aalto XML processor is a next-generation StAX XML processor implementation. It is not directly related to other existing mature implementations (such as Sun Java Streaming XML Parser), although it did come about as a prototype for evaluating implementation strategies that differ from those traditionally used for Java-based parsers” [1].

Aalto is licensed under Apache License 2.0 and can be used in paid software. There are several benchmarking tests where Aalto is a clear winner. Aalto performed 30-40% faster than SJSXP [2] and it came out on top on several different charts [3].

There is no way of knowing in advance how Woodstox and Aalto perform when parsing a COLLADA file. COLLADA files are special because most of the data consists of float arrays. Specific benchmarking tests are needed in order to determine which one is faster. Since there is no functionality for parsing a COLLADA file yet, it is hard to determine this.

Since both Woodstox and Aalto implement the StAX interface, they can easily be swapped. Aalto was selected as the parser library because of its overall good results in various benchmarking tests.

Chapter 4

Implementation

4.1 System Design

This design will depend on how Aalto is implemented. A wrapper will be created for Aalto which will be used in the system in some base class. Figure 4.1 illustrates my state diagram for the parser system.

Figure 4.1: State diagram.

The system will consist of a set of parsers, each identified by the XML tag read from the COLLADA document. The data read from COLLADA will be stored in a data container and can then be converted and written to an A3X file. This is a very rough design, and there is little point in making it more detailed at this stage; it only gives an approximate picture of how the system will look, which is important for the implementation.

4.2 Aalto

There are essentially two versions of the Aalto library: a synchronous version and an asynchronous version.

4.2.1 Synchronous Version

The synchronous version of Aalto wasn’t much trouble setting up. It uses a cursor which iterates through the document step by step like any other StAX library. Figure 4.2 shows how a synchronous loop is implemented in Java.

This code is identical to the actual implementation.

    // streamReader and inputFactory are fields declared elsewhere in the class.
    public void runAalto(String fileName) {
        try {
            FileInputStream stream = new FileInputStream(fileName);
            streamReader = (StreamReaderImpl) inputFactory
                    .createXMLStreamReader(stream);

            // Pull one event at a time until the end of the document.
            while (streamReader.hasNext()) {
                int type = streamReader.next();
                switch (type) {
                case XMLStreamConstants.START_ELEMENT:
                    // start tag
                    break;
                case XMLStreamConstants.CHARACTERS:
                    // character data
                    break;
                case XMLStreamConstants.END_ELEMENT:
                    // end tag
                    break;
                default:
                    break;
                }
            }
            stream.close();
            streamReader.close();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }

Figure 4.2: Aalto synchronous code example.

4.2.2 Asynchronous Version

The asynchronous version of Aalto is a bit different from the synchronous version. Instead of reading from the XML file itself, it is fed a buffer of bytes representing the file, which it then processes. This makes it possible to parse a document in different parts at different times, which could be necessary for Aalto to work in conjunction with GWT in the browser. Figure 4.3 shows how an asynchronous loop in Aalto is implemented. This code is identical to the actual implementation.

    private void runAalto(byte[] bytes) {
        int xmlEvent = 0;
        int byteArrayIndex = 0;

        int feedSize = FEED_SIZE;
        try {
            while (byteArrayIndex < bytes.length) {
                // hand the next block of bytes to the parser
                inputFeeder.feedInput(bytes, byteArrayIndex, feedSize);
                // adjust the byte array index and feed size
                byteArrayIndex += feedSize;
                int nextFeedSize = bytes.length - byteArrayIndex;
                if (nextFeedSize < feedSize) {
                    feedSize = nextFeedSize;
                }
                // process the events produced from the bytes fed so far
                do {
                    xmlEvent = streamReader.next();
                    switch (xmlEvent) {
                    case XMLStreamConstants.START_ELEMENT:
                        onStartElement(streamReader);
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        onCharacters(streamReader);
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        onEndElement(streamReader);
                        break;
                    }
                } while (xmlEvent != XMLStreamConstants.END_DOCUMENT);
            }
        } catch (Exception exception) {
            // exception handling not shown
        }
    }

Figure 4.3: Aalto asynchronous code example.
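The index bookkeeping for feeding a byte array in blocks can be looked at in isolation. The following sketch does not use Aalto at all; it only computes the (offset, length) pairs that such a feed loop would pass on, and it clamps every block, including the first one, so that an input shorter than the feed size is not overrun. The class ChunkPlan and the method plan are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class ChunkPlan {
    // Returns {offset, length} pairs that cover [0, totalLength)
    // in consecutive blocks of at most feedSize bytes.
    public static List<int[]> plan(int totalLength, int feedSize) {
        if (feedSize <= 0) {
            throw new IllegalArgumentException("feedSize must be positive");
        }
        List<int[]> chunks = new ArrayList<>();
        for (int offset = 0; offset < totalLength; offset += feedSize) {
            // The last block is usually shorter than feedSize.
            int length = Math.min(feedSize, totalLength - offset);
            chunks.add(new int[] {offset, length});
        }
        return chunks;
    }

    public static void main(String[] args) {
        for (int[] chunk : plan(10, 4)) {
            System.out.println("offset " + chunk[0] + ", length " + chunk[1]);
        }
    }
}
```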

4.2.3 Client-side

For the client-side parser the asynchronous version of Aalto will be used since it could be necessary in order for the parser to work in the browser. Since JavaScript is single-threaded, blocking the thread should be avoided, which is why this separation could be needed. It’s also easy to swap between the two versions, so if the asynchronous version turns out to be unnecessary it can easily be replaced with the synchronous version.

Aalto uses a lot of different standard libraries which aren’t supported in GWT. These libraries and the references to them had to be stripped from the Aalto source code in order for Aalto to work with GWT. Fortunately the cursor object which iterates through the document isn’t dependent on all these libraries, so they could be stripped without removing needed functionality. This is however only needed for the client-side version, since the server doesn’t use GWT and runs plain Java.

One limitation with GWT is that there is no way to access a file on a local filesystem without any interaction with a server. Since a drag and drop functionality is needed this problem must be solved. In JavaScript there are objects that can be used for loading files locally. By using JSNI you can create a FileReader object which is a native JavaScript object that can read files locally. This is the workaround solution for this limitation in GWT. Without JSNI this wouldn’t be possible because of the restrictions of GWT.

When loading the file using the FileReader object you get an ArrayBuffer as the result, which is a byte buffer. Aalto’s asynchronous version is fed a byte array (byte[]), so a conversion is needed from the type ArrayBuffer to byte[]. Instead of creating a for loop and copying each byte, the source code of Aalto can be modified to support the type ArrayBuffer. This gives better performance since the whole array doesn’t have to be looped through.

4.2.4 Server-side

The synchronous version of Aalto will be used for the server-side parser since there isn’t any reason to use the asynchronous version. The asynchronous version can be used for separating loading from parsing, which is not an issue on the server side since the server, unlike JavaScript, isn’t single-threaded.

4.3 Collada Parser

The main parser object is an abstract class which is extended by two different implementations; client and server. The difference is that the client parser compiles with GWT and uses the asynchronous version of Aalto while the server parser compiles with Java and uses the synchronous version of Aalto.

This means that there are two different applications to test; this is only due to the limitation that GWT does not support all the standard libraries in the JDK. With this structure the same parser logic can be used for both server and client (since both extend the main parser class ColladaParser). Figure 4.4 illustrates this relation.

Figure 4.4: Parser class diagram.

The ColladaParser is an abstract class which contains the functionality to parse the data that is read from COLLADA documents. Since both the server and the client extend this class, they share that functionality while having different Aalto implementations.

4.3.1 Parsing COLLADA Tags

There are many different tags that need to be read and stored in a container. The implementation needs to be flexible in the sense that it should be easy to change which tags are read and how the information is stored. My supervisor pointed out that an effective way could be to store tag parsers in a hash map: whenever a start tag is read, you check the hash map for an entry and, if a parser is found, run it with the related object as argument. In this way you can create a parser object for each tag that needs parsing. In order to avoid name collisions (for example, <input> can reside under both <float_array> and <triangles>) the key needs to be distinctive, so the parent names must be kept when traversing tags. A hash map also has a fast average lookup time (O(1)) [14], which is needed since there are a lot of different tags to parse. Figure 4.5 shows the interface which the tag parsers implement.

public interface TagParser {
    void onStartElement(StreamReader reader, Object arg);
    void onCharacters(StreamReader reader, Object arg);
    void onEndElement(StreamReader reader, Object arg);
}

public class Example implements TagParser {
    // Prints the opening tag.
    public void onStartElement(StreamReader reader, Object arg) {
        System.out.println("<" + reader.getLocalName() + ">");
    }

    // Prints the text content of the element.
    public void onCharacters(StreamReader reader, Object arg) {
        System.out.println(reader.getText());
    }

    // Prints the closing tag.
    public void onEndElement(StreamReader reader, Object arg) {
        System.out.println("</" + reader.getLocalName() + ">");
    }
}

Figure 4.5: TagParser interface

If there is a new tag you want to parse, you implement the TagParser interface. Instead of having one large object that handles all tags, you have a set of smaller parsers where each has its specific task. Whenever one of these parsers is running you know exactly where in the document you are and precisely what kind of data needs to be read. There are other solutions to this problem, such as comparing strings, which haven't been tested, but in theory a hash map should be the fastest choice. Whether the hash map actually is a bottleneck is a typical test that can be done once the solution works; if it turns out to be one, this part could be redesigned. Figure 4.6 is an example of how TagParser objects are used in the class ColladaParser:

public class ColladaParser {
    // ... omitting declarations and other stuff

    public void onStartElement(XMLStreamReader2 reader) {
        TagParser parser = parsers.get(reader.getLocalName());
        if (parser != null) {
            parser.onStartElement(reader, obj);
        }
    }

    public void onCharacters(XMLStreamReader2 reader) {
        TagParser parser = parsers.get(reader.getLocalName());
        if (parser != null) {
            parser.onCharacters(reader, obj);
        }
    }

    public void onEndElement(XMLStreamReader2 reader) {
        TagParser parser = parsers.get(reader.getLocalName());
        if (parser != null) {
            parser.onEndElement(reader, obj);
        }
    }
}

Figure 4.6: Hashmap used in ColladaParser.

When a start tag is read in the COLLADA document, you read the tag name and do a lookup in the hash map. If a parser is found it is executed, otherwise the tag is simply ignored. The methods onStartElement, onCharacters and onEndElement in Figure 4.6 are called from the Aalto implementation.
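The name-collision issue mentioned earlier can be handled by using the path of open tags as the hash map key. The following is a minimal sketch under that assumption; PathDispatcher, Handler and the dot-separated key format are hypothetical and not taken from the actual parser.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical dispatcher: keys are dot-separated paths of the currently open tags.
class PathDispatcher {
    // Hypothetical handler type standing in for TagParser.
    interface Handler {
        void handle(String path);
    }

    private final Map<String, Handler> parsers = new HashMap<>();
    private final Deque<String> openTags = new ArrayDeque<>();

    void register(String path, Handler handler) {
        parsers.put(path, handler);
    }

    // Called for every start tag: push the name, then look up the full path.
    void onStartElement(String localName) {
        openTags.addLast(localName);
        String path = currentPath();
        Handler handler = parsers.get(path);
        if (handler != null) {
            handler.handle(path);
        }
    }

    // Called for every end tag: leave the current element.
    void onEndElement() {
        openTags.removeLast();
    }

    // Joins the open tags from the outermost to the innermost one.
    String currentPath() {
        return String.join(".", openTags);
    }
}
```

With keys of this form, a handler registered for "mesh.triangles.input" is not triggered by an <input> tag that appears under a different parent.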


4.3.2 Main Tags

COLLADA files consist of a lot of different tags, and the ones that contain the most data are the <float_array> and <triangles> tags. The <float_array> tag contains positions, normals and texture coordinates. The <triangles> tag contains triangles that link together the data from <float_array> and form a collection of flat surfaces. There are also several tags that together make up materials. A material describes the lighting properties and texture of a mesh.

Parsing <float_array>

This data is read by reading the whole element as a double array. The type double is needed since the geographical positions used in 3D Maps can often exceed what a 32-bit float can represent accurately. The array is then stored in a Source object which contains the id of the data. This id is needed in order to connect the indices from the <triangles> tag. Figure 4.7 shows the implementation of a private method in the <float_array> parser which is used to read the float arrays.
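The need for double can be illustrated with a small sketch. A 32-bit float has a 24-bit significand, so for coordinates in the millions its resolution is coarser than a tenth of a unit. The class PrecisionDemo and the sample coordinate used in the note below are hypothetical.

```java
class PrecisionDemo {
    // Absolute error introduced by storing a coordinate as a 32-bit float.
    static double floatError(double coordinate) {
        float narrowed = (float) coordinate;
        return Math.abs((double) narrowed - coordinate);
    }
}
```

For a coordinate such as 6378137.123, floatError reports an error of more than a tenth of a unit, whereas small values such as 1.5 survive the conversion unchanged.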


private void readElement(StreamReader2 reader, int count, Mesh mesh) {
    double[] buffer = new double[count];
    try {
        // offset is a field of the parser (declaration omitted)
        int byteLength = count - offset;
        // lengths passed to and returned by this call count array elements
        int bytesRead = reader.readElementAsDoubleArray(
                buffer,      // data buffer
                0,           // offset into buffer
                byteLength); // number of values requested

        if (bytesRead == byteLength) {
            // read complete
            mesh.getSource().setArray(buffer);
        } else {
            // read incomplete, store state variables
            // ...
        }
    } catch (Exception e) {
        // ...
    }
}

Figure 4.7: <float_array> parser

Parsing <triangles>

The <triangles> tag is a different story. It consists of a large int array of triangles which connects the data from <float_array> and creates flat surfaces. Each triangle consists of 3 vertices, where each vertex consists of a position, a normal and a texture coordinate. The triangles in COLLADA are arranged in such a way that the properties of a vertex can be in any order and are instead referenced by an index and an offset. This is a problem because the index buffer of the A3X file format doesn't support different offsets. Figure 4.8 illustrates this problem.


Figure 4.8: Triangle indexing.
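To make the offset layout concrete, the sketch below splits an interleaved index array into one index stream per vertex property. The class IndexSplitter and its parameter names are hypothetical; the stride is assumed to be the number of properties per vertex (the largest offset plus one).

```java
// Hypothetical helper: extracts the indices of one vertex property from the interleaved array.
class IndexSplitter {
    // p:      interleaved indices as read from the triangle data
    // stride: number of indices stored per vertex (largest offset + 1)
    // offset: position of the wanted property inside each vertex group
    static int[] extract(int[] p, int stride, int offset) {
        int[] out = new int[p.length / stride];
        for (int v = 0; v < out.length; v++) {
            out[v] = p[v * stride + offset];
        }
        return out;
    }
}
```

With positions at offset 0, normals at offset 1 and texture coordinates at offset 2, the array {0, 5, 2, 1, 5, 0, 2, 6, 1} describes three vertices whose position indices are {0, 1, 2}, normal indices {5, 5, 6} and texture coordinate indices {2, 0, 1}.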

In order for this to work the indices need to be rearranged into the correct order. Another problem with COLLADA is that multiple vertices can be identical, which means there is unnecessary geometry that needs to be filtered out for performance. This can be done using an algorithm given by my supervisor, which is currently used in their JAXB parser. Unfortunately there is no reference to this algorithm since he devised it himself. Figure 4.9 is the pseudocode for the algorithm (pseudocode is code without a specific syntax, often used to describe algorithms).


declaration:
    vertex {
        index    : int
        position : double3
        normal   : double3
        texCoord : double2

        int hashCode() {...}
        bool equals(Object obj) {...}
    }
    map<vertex, vertex> hashmap  // vertices seen so far
    list<vertex> vertices        // contains unique vertices
    list<int> indices            // index list
    int index = 0                // current vertex index
end

function(buffer):
    while (buffer):
        create vertex
        vertex.position = buffer.getPosition()
        vertex.normal = buffer.getNormal()
        vertex.texCoord = buffer.getTexCoord()

        vertex result = hashmap.get(vertex)
        if (result == null):
            // vertex doesn't exist. Add the vertex
            hashmap.put(vertex, vertex)
            // update vertex index and lists
            vertex.index = index++
            vertices.add(vertex)
            indices.add(vertex.index)
        else:
            // vertex already exists
            indices.add(result.index)
end

Figure 4.9: Pseudocode for converting triangles from COLLADA to A3X.


The algorithm makes use of a hash map to determine whether a vertex is unique or not. In each iteration a new vertex is constructed using the indices read from the array of triangles. This vertex is then looked up in the hash map. If it exists, the new vertex is unnecessary and is instead referenced by the index of the existing vertex, which makes sure that each stored vertex is unique. If the vertex doesn't exist it is added to the hash map and to the vertex list, and its new index is added to the index list.

When the whole buffer has been read we have a list of unique vertices and an index list which references the vertices in the correct order.

The only purpose of the hash map is to identify unique vertices, so at this point it is no longer needed.

There isn't any known source code for a similar algorithm or for a different solution that deals with this problem. Since Agency9 had already solved this problem in their current JAXB parser, and that solution works and is efficient, there was no need to look into alternatives. The pseudocode was translated into Java code, which looks similar.
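The thesis code itself is not reproduced here. As an illustration only, the following is a minimal Java sketch of the same idea, with the hypothetical names VertexKey and VertexDeduplicator and with each vertex flattened into one double array (position, normal, texture coordinate). Equality deliberately ignores the assigned index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical vertex key: equality covers the attributes, not the assigned index.
final class VertexKey {
    final double[] data; // position (3), normal (3), texCoord (2)
    int index = -1;

    VertexKey(double[] data) {
        this.data = data;
    }

    @Override
    public int hashCode() {
        return Arrays.hashCode(data);
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof VertexKey && Arrays.equals(data, ((VertexKey) obj).data);
    }
}

class VertexDeduplicator {
    final List<VertexKey> vertices = new ArrayList<>(); // unique vertices
    final List<Integer> indices = new ArrayList<>();    // one entry per input vertex
    private final Map<VertexKey, VertexKey> seen = new HashMap<>();

    // Adds one vertex from the triangle buffer, reusing an existing index if possible.
    void add(double[] attributes) {
        VertexKey candidate = new VertexKey(attributes);
        VertexKey existing = seen.get(candidate);
        if (existing == null) {
            // First occurrence: assign the next free index and remember the vertex.
            candidate.index = vertices.size();
            seen.put(candidate, candidate);
            vertices.add(candidate);
            indices.add(candidate.index);
        } else {
            // Duplicate: only reference the vertex stored earlier.
            indices.add(existing.index);
        }
    }
}
```

Feeding four vertices where the third repeats the first yields three unique vertices and the index list 0, 1, 0, 2.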

Parsing Materials

Materials in COLLADA are a bit tricky. Several tags contain data for materials, and these are in turn referenced by another tag, which in turn is referenced by yet another tag, and so on.

The materials can be used by different meshes. When creating a mesh the id will be stored in a hashmap which can then be used to retrieve the index of the mesh. When creating a material this hashmap can be used for lookup since the material will have a reference to a mesh id.
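The lookup described above could be sketched as follows. MeshRegistry and its methods are hypothetical names, and the mesh index is assumed to be the order in which meshes are created.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical registry mapping mesh ids to the index of the mesh.
class MeshRegistry {
    private final Map<String, Integer> indexById = new HashMap<>();

    // Registers a mesh id and returns the index assigned to it.
    int register(String meshId) {
        Integer existing = indexById.get(meshId);
        if (existing != null) {
            return existing;
        }
        int index = indexById.size();
        indexById.put(meshId, index);
        return index;
    }

    // Resolves the mesh id referenced by a material; returns -1 if the id is unknown.
    int lookup(String meshId) {
        Integer index = indexById.get(meshId);
        return index == null ? -1 : index;
    }
}
```

When a material is created, its mesh reference is resolved with lookup, so the material can be attached to the right mesh without scanning all meshes.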

4.3.3 Other tags

Several tags have to be parsed, and section 4.3.2 presented only a subset of them. The other tags that have to be parsed are listed here. These are the tags that are necessary for the COLLADA file to be parsed correctly; other, optional tags are not mentioned.

• <asset.unit>

Unit of measurement. Used to scale the model to correct proportions.


• <asset.up_axis>

The up axis of the model. Y is the default in COLLADA, but X and Z can also be used. This determines how the triangles are arranged.

• <library_images.image>

Id of an image.

• <library_images.image.init_from>

Source path of an image.

• <library_visual_scenes.visual_scene.node.matrix>

Matrix transformation specific to a mesh. A matrix transformation can move, rotate or scale a mesh.

• <library_visual_scenes.visual_scene.node.instance_geometry.bind_material.technique_common.instance_material>

Binds a material to a mesh.
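As an illustration of how the unit tag can be applied, the sketch below scales positions by the number of metres per document unit, which COLLADA stores in the meter attribute of the unit tag. The class UnitScaler is hypothetical, and metres are only assumed to be the target unit.

```java
// Hypothetical helper: converts positions from document units to metres.
class UnitScaler {
    // Multiplies every coordinate in place by the number of metres per document unit.
    static void toMetres(double[] positions, double metresPerUnit) {
        for (int i = 0; i < positions.length; i++) {
            positions[i] *= metresPerUnit;
        }
    }
}
```

A document modelled in inches would, for example, use a factor of 0.0254.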

4.4 A3X Converter

The A3X converter makes use of the existing A3X library created by Agency9. There are essentially two different implementations, one for the server side and one for the client side. This is for the same reason as for the ColladaParser object: compatibility with GWT. Figure 4.10 is the implementation of the base class.


public abstract class A3XConverter {
    private A3X a3x;

    public A3XConverter(A3X a3x) {
        this.a3x = a3x;
    }

    protected A3X getA3X() {
        return a3x;
    }

    public abstract void convert(Mesh mesh);
    public abstract void convert(Material material);
    public abstract void convert(Asset asset);
    // ...

    public abstract void write(String path);
}

Figure 4.10: A3XConverter class.

With this design you can create two classes that extend A3XConverter, each implemented using different methods from the A3X library.

In the example in Figure 4.10 the Mesh, Material and Asset objects are created when parsing the COLLADA document and then used to convert and add the data to an A3X object.

When a mesh object has been parsed completely it can be converted directly, so you don't have to store multiple meshes at the same time. You store one mesh, add its data to the A3X file and then throw the mesh away. In this way the parser has the potential to become a bit more memory efficient by not holding on to every bit of data that is read from the COLLADA document. Because of the nature of the garbage collector in Java this might actually not have any effect at all [10].
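The convert-and-discard idea can be sketched as below. MeshSink and StreamingMeshHandler are hypothetical names, with MeshSink standing in for the convert(Mesh) step and a plain double array standing in for a mesh.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sink standing in for the conversion of one finished mesh.
interface MeshSink {
    void convert(double[] meshData);
}

class StreamingMeshHandler {
    private final MeshSink sink;
    private List<Double> current; // data of the mesh being parsed, or null

    StreamingMeshHandler(MeshSink sink) {
        this.sink = sink;
    }

    // Start tag of a mesh: begin collecting its data.
    void onMeshStart() {
        current = new ArrayList<>();
    }

    void onValue(double value) {
        current.add(value);
    }

    // End tag of a mesh: convert it immediately and drop the reference.
    void onMeshEnd() {
        double[] data = new double[current.size()];
        for (int i = 0; i < data.length; i++) {
            data[i] = current.get(i);
        }
        sink.convert(data);
        current = null; // nothing is retained between meshes
    }

    boolean holdsMesh() {
        return current != null;
    }
}
```

Only the mesh currently being parsed is referenced, so earlier meshes become eligible for garbage collection as soon as they have been converted.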


4.4.1 Client-side

The client-side converter is similar to the server-side converter. Some objects used in the server-side converter are not included in the GWT library, and these parts need to be implemented differently. Once these parts are implemented it should work exactly like the server-side converter.

The A3X library Agency9 is using has a desktop version which can be used on the server side. On the client side you can't use, for example, java.nio.ByteBuffer, which is used extensively in the desktop version. The workaround was to create a new ByteBuffer class with an interface identical to the original but a different implementation. The A3X library contains an abstract implementation of a byte buffer with support for GWT. Extending that class and backing the implementation with your own byte[] object internally made this possible. Since the A3X library had already implemented the base functionality there was no need to create a new one.
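A byte buffer backed by a plain byte[] could look like the minimal sketch below. ArrayByteBuffer is a hypothetical class; its method names mirror java.nio.ByteBuffer, but it is neither the A3X class nor a complete replacement, and big-endian byte order is assumed.

```java
// Hypothetical buffer backed by a byte[]; only a small part of the ByteBuffer interface is shown.
class ArrayByteBuffer {
    private final byte[] data;
    private int position = 0;

    ArrayByteBuffer(int capacity) {
        data = new byte[capacity];
    }

    // Writes a 32-bit integer in big-endian order at the current position.
    ArrayByteBuffer putInt(int value) {
        data[position++] = (byte) (value >>> 24);
        data[position++] = (byte) (value >>> 16);
        data[position++] = (byte) (value >>> 8);
        data[position++] = (byte) value;
        return this;
    }

    // Reads a big-endian 32-bit integer from an absolute index.
    int getInt(int index) {
        return ((data[index] & 0xFF) << 24)
                | ((data[index + 1] & 0xFF) << 16)
                | ((data[index + 2] & 0xFF) << 8)
                | (data[index + 3] & 0xFF);
    }

    int position() {
        return position;
    }

    byte[] array() {
        return data;
    }
}
```

Because it only relies on arrays and bit operations, a class of this kind avoids the JDK classes that GWT does not provide.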

4.4.2 Server-side

For the server-side the desktop version of the A3X library was used which consists of a lot of useful objects to add data to an A3X file. When all data has been read and stored from the COLLADA document you can use these objects and methods to create A3X versions of the COLLADA data and add the data to an A3X object. Figure 4.11 is an implementation of the method convert(Material).


public void convert(collada.Material colladaMaterial, int mesh) {
    InstanceGroup instanceGroup = instances.getInstanceGroup(mesh);
    Instance instance = instanceGroup.getInstance();
    Material a3xMaterial = new agency9.Material();

    Effect effect = colladaMaterial.getEffect();
    if (effect != null) {
        // copy each colour component that is present in the effect
        double[] p = null;
        p = effect.getAmbient();
        if (p != null) {
            a3xMaterial.setColor(MaterialType.AMBIENT,
                    p[0], p[1], p[2], p[3]);
        }
        p = effect.getEmission();
        if (p != null) {
            a3xMaterial.setColor(MaterialType.EMISSION,
                    p[0], p[1], p[2], p[3]);
        }
        p = effect.getSpecular();
        if (p != null) {
            a3xMaterial.setColor(MaterialType.SPECULAR,
                    p[0], p[1], p[2], p[3]);
        }
        p = effect.getDiffuse();
        if (p != null) {
            a3xMaterial.setColor(MaterialType.DIFFUSE,
                    p[0], p[1], p[2], p[3]);
        }
        a3xMaterial.setValue(
                MaterialType.SHININESS, effect.getShininess());
    }
    // write texture ...

    instance.setMaterial(a3xMaterial);
}

Figure 4.11: Adds material data to A3X object.


4.5 Drag and Drop Functionality

One of the goals was to add a drag and drop functionality in the browser.

With GWT you can create JavaScript files and integrate them into your GWT project. By creating a drag and drop functionality in JavaScript and using the objects created in Java you can make this work. There is actually no support for drag and drop in GWT using the standard libraries and that’s why JavaScript is needed in order to make it work.

With GWT you can export your Java classes and use them in JavaScript. By creating the Java objects in JavaScript and combining it with the drag and drop functionality included in JavaScript libraries you can create a callback function that responds to the dropped files, and parse them depending on the file type.

GWT has support for integrating JavaScript files into the compilation, so that they are included in the final JavaScript file. Figure 4.12 demonstrates an implementation of the drag and drop functionality and the callback function which starts the actual parsing of a COLLADA file.


function initColladaParser() {
    parser = new collada.ColladaParserGWT();

    reader = new FileReader();
    reader.onload = function(event) {
        /* Blob is loaded, run the parser (async aalto) */
        parser.run(this.result);

        /* Load the resulting A3X blob in 3D Maps. */
        var modelReader = new Agency9.Model.ModelReader();
        modelReader.read(window.blob, load(0));
    }

    /* Checks if FileReader is supported */
    if (typeof window.FileReader === 'undefined') {
        console.log("File API & FileReader unavailable");
    } else {
        /* Prevent default drag n drop behaviour */
        document.addEventListener("dragenter", function(event) {
            event.preventDefault();
        });
        document.addEventListener("dragover", function(event) {
            event.preventDefault();
        });
        document.addEventListener("dragleave", function(event) {
            event.preventDefault();
        });

        /* Read the file dropped */
        document.addEventListener("drop", function(event) {
            /* assumed: take the first dropped file */
            var file = event.dataTransfer.files[0];
            reader.readAsArrayBuffer(file);
        });
    }
}

Figure 4.12: Drag and drop in JavaScript.