
Protein Visualization and Haptics




LiU-ITN-TEK-A--08/066--SE

Protein Visualization and Haptics

Joel Nises

2008-05-15

Department of Science and Technology, Linköping University, SE-601 74 Norrköping, Sweden

Master's thesis in scientific visualization carried out at the Institute of Technology, Linköping University.

Joel Nises

Supervisor: Petter Bivall Persson
Examiner: Anders Ynnerman

Norrköping, 2008-05-15

Copyright

The publishers will keep this document online on the Internet, or its possible replacement, for a considerable time from the date of publication barring exceptional circumstances.

The online availability of the document implies a permanent permission for anyone to read, to download, to print out single copies for their own use and to use it unchanged for any non-commercial research and educational purpose. Subsequent transfers of copyright cannot revoke this permission. All other uses of the document are conditional on the consent of the copyright owner. The publisher has taken technical and administrative measures to assure authenticity, security and accessibility.

According to intellectual property law the author has the right to be mentioned when his/her work is accessed as described above and to be protected against infringement.

For additional information about the Linköping University Electronic Press and its procedures for publication and for assurance of document integrity, please refer to its WWW home page: http://www.ep.liu.se/

© Joel Nises


Abstract

The Chemical Force Feedback (CFF) visual and haptic molecule rendering application is being developed at Linköping University to evaluate the use of haptics as a teaching tool for protein-ligand docking in molecular life science. It requires several additional features to function as a mature protein visualization tool. Previous development of the application has focused mostly on the haptic part of the program, leaving the visual representation somewhat underdeveloped. This thesis project has implemented various features, both to improve the immediate functionality of the application and to lay the foundation for future additions.

The main contributions of this thesis are the following. Firstly, integration of the DSSP algorithm has provided the application with secondary structure information for the protein visualization. Secondly, a new rendering system supports correct rendering of transparent surfaces and simplifies the future addition of new visual representations of molecules. Thirdly, a file-selection dialog has been implemented using the haptic scenegraph, as a step toward making the user interface of the application more user-friendly.

Acknowledgments

I would like to thank Petter Bivall Persson, Gunnar Höst, Anders Ynnerman, Karljohan E. Lundin Palmerius, Willem Frishert, Johan Lindstrand and Ruman Zakaria for their assistance and help with my thesis.


Contents

1 Introduction
  1.1 Outline
2 Background
  2.1 Haptics
  2.2 Protein-ligand docking
  2.3 Secondary structure
  2.4 Computer graphics
  2.5 Molecule rendering
3 Previous work
4 Problem description
5 Overview of methods
  5.1 Secondary structure
    5.1.1 DSSP
  5.2 Spatial indexing
    5.2.1 Uniform grid
    5.2.2 Octree
  5.3 Transparency
    5.3.1 Painter’s algorithm
    5.3.2 Depth-peeling
  5.4 Filtering
    5.4.1 Per object filtering
    5.4.2 Per fragment filtering
6 Implementation
  6.1 Determining the secondary structure
  6.2 Rendering
    6.2.1 Transparency
    6.2.2 Filtering
    6.2.3 Rendering-modes
  6.3 User interface
  6.4 Spatial index
7 Results
8 Future work
Bibliography

List of Figures

2.1 The PHANTOM Desktop haptic device
2.2 Ligand and protein
2.3 The main secondary structure-types
2.4 Different rendering-modes all colored by secondary structure
2.5 Ribbon representation
5.1 Spatial indexing structures
5.2 Intersecting surfaces
6.1 Current filter structure
6.2 The file-selection dialog
7.1 Rendering of transparent atoms


Chapter 1. Introduction

The purpose of this thesis is to investigate and implement methods for improving the Chemical Force Feedback (CFF) application. CFF is a visual and haptic system, developed at Linköping University, that deals with protein-ligand docking. Protein-ligand docking is the process in which a small molecule binds to a large protein molecule.

This thesis continues work that started with a thesis [1] and was further developed as part of a research project. The resulting application has been used in papers [2][3] discussing the use of haptics as a teaching tool for molecular life science.

The work presented in this thesis is based on a set of necessary improvements determined by an interdisciplinary group with members from ITN/Division for Visual Information Technology and Applications (VITA), IKE/Division of Cell Biology and IFM/Division of Chemistry Molecular biotechnology, all at Linköping University.

1.1. Outline

The background chapter introduces some terminology and gives a basic introduction to the concepts needed to understand the rest of the report.

Previous work describes the previous system, which this thesis extends.

Problem description details the specific problems addressed by this thesis, as well as some other considerations that influence the design of the system.

The overview of methods chapter describes some general theory as well as the different techniques that were considered for the implementation of the application.

Implementation discusses the actual implementation of the project and explains the choice among the techniques described in the overview of methods chapter.

Results states the outcome of the thesis, that is, which improvements were made.

Future work discusses possible future improvements of the application.


Chapter 2. Background

2.1. Haptics

The word haptic comes from Greek and means approximately “touch”. It is usually used to describe the study of the human sensory modality of touch. In this report haptic refers to haptic feedback, which is the use of technology to simulate the sensation of touching a virtual object as if it were real.

In virtual reality a common way of implementing haptic interaction is to let the user hold a stylus that is controlled by a motorized arm, such as the one shown in figure 2.1. The arm allows the pen to move freely within the volume restricted by the length of the arm and its angular constraints. The motors can apply various forces to the pen and thus provide sensory feedback such as the pressure and texture of an object.

Figure 2.1: The PHANTOM Desktop haptic device.

While 30 frames per second is enough to fool the eye into perceiving a set of images as fluid motion, the update rate for a haptic device needs to be much higher. Approximately 1000 Hz is considered enough to make the haptic feedback accurate. At a lower update rate a hard material would feel more like rubber. The high update rate requires very quick calculation of the feedback forces, making it hard to implement very complex models.

2.2. Protein-ligand docking

In the application there are generally two molecules present at any one time, as shown in figure 2.2: the protein, which is the larger of the two, and the ligand, which is the small molecule controlled by the haptic stylus.

Figure 2.2: Ligand and protein.

Proteins are large molecules important to most processes found in living organisms. They have several different functions, for example aiding certain chemical reactions or participating in signal transduction.

A protein is made up of a chain of amino acids. Residue is the word normally used for an amino acid in the context of a protein chain. There are 20 standard types of amino acids in proteins, and they all share some common features. They all contain a central carbon atom, called the α-carbon, to which a carboxyl group and an amine group are bound. The α-carbon also binds to a side-chain that is specific to each amino acid type.

Protein-ligand docking involves finding the binding site where the ligand fits into the protein. The ligand is docked when it is placed at the bottom of an energy well produced by various intermolecular forces between the protein and the ligand.

Docking is often used to test drugs, in which case the ligand is the active molecule of the drug. When a ligand binds to a protein it may alter the function of the protein. In many cases, this is the mechanism by which drugs cause their effects on the body.

In CFF the visual representation of the docking procedure is augmented by applying haptic forces to the stylus in proportion to the intermolecular forces between protein and ligand.

2.3. Secondary structure

The structure of a protein can be described at different levels. The primary structure is the sequence of amino acids that build up the protein. The secondary structure describes how local structures are formed by non-covalent interactions between residues. The interaction that holds the secondary structures together is the hydrogen bond, an attractive force between the amine group of one residue and the carbonyl group of another residue (N-H <-> O=C).

The protein chain folds into different secondary structures depending on the sequence of residues. The two most common are the α-helix, shown in figure 2.3a, and the β-sheet, shown in figure 2.3b. A helix is made up of several turns in which residues three, four or five positions apart in the chain are connected by hydrogen bonds. A sheet is made up of several ladders in which consecutive residues on adjacent strands hydrogen bond to each other in a pairwise fashion.

Figure 2.3: The main secondary structure-types, (a) α-helix and (b) β-sheet, shown as a combination of ribbon and sticks representations. Images created using WebLab Viewer.

It is important to visualize the secondary structure, as experts use it to orient themselves in the protein. The structure creates reference points that aid in describing positions in, and orientations of, a protein. It also allows biochemists to visually deduce relationships between different proteins that the amino acid chain alone does not reveal. Certain combinations of secondary structures also provide information about the function of the protein.

There are also the tertiary and quaternary structures. The tertiary structure describes the three-dimensional position of the atoms. The quaternary structure describes the positions of, and interactions between, the protein chains of proteins that consist of more than one subunit. These two structure levels are not considered in this report.

2.4. Computer graphics

The Graphics Processing Unit (GPU) is the processor on the video card of a computer. In this thesis the OpenGL graphics Application Programming Interface (API) was used to interface with the graphics card, so many of the graphics-related terms are the ones used in OpenGL.

Modern GPUs are controlled by small programs called shaders. There are currently two main types of shaders, fragment-shaders and vertex-shaders. Fragment-shaders are sometimes also known as pixel-shaders and control the appearance of individual pixels.

The OpenGL rendering API is designed as a state machine. The API keeps track of states, such as the current color, and uses a set state for all rendering until it is changed by a call to the API. The resulting combination of all these settings is referred to as the rendering state in this report.

2.5. Molecule rendering

There are several common ways of visualizing molecules. The ones referred to in this report include balls and sticks, Van der Waals, sticks, backbone and ribbon.

Balls and sticks, shown in figure 2.4a, renders the atoms in the molecule as balls, while sticks or cylinders represent the covalent bonds between the atoms.

The Van der Waals representation, shown in figure 2.4b, renders only the atoms and ignores the bonds between them. Atoms are rendered as spheres with a radius given by the Van der Waals radius of each atom type.

The sticks representation, shown in figure 2.4c, works like the balls and sticks representation but without the balls, thus showing only the bonds between the atoms.

The backbone representation, shown in figure 2.4d, renders sticks between the α-carbon atoms in the molecule, thus showing how the center of the protein chain is folded in space.

Figure 2.4: Different rendering-modes, all colored by secondary structure: (a) balls and sticks, (b) Van der Waals, (c) sticks, (d) backbone. Red atoms belong to α-helix structures, cyan to β-sheets and white is undefined.

The ribbon representation, shown in figure 2.5, works like the backbone representation in the sense that it only visualizes the center of the protein chain. The ribbon representation, however, renders the backbone by fitting splines to the α-carbons instead of drawing sticks between them, thus achieving a smoother curve.

Figure 2.5: Ribbon representation of a protein. Image created using WebLab Viewer.
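The difference between the backbone and ribbon representations can be illustrated with a small sketch. The report does not say which kind of spline a ribbon renderer uses, so the uniform Catmull-Rom spline below is only an assumption, chosen because it passes through every control point. The function names and the α-carbon coordinates are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point on the uniform Catmull-Rom segment between p1 and p2, t in [0, 1]."""
    t2, t3 = t * t, t * t * t
    return tuple(
        0.5 * (2.0 * b + (c - a) * t
               + (2.0 * a - 5.0 * b + 4.0 * c - d) * t2
               + (3.0 * b - a - 3.0 * c + d) * t3)
        for a, b, c, d in zip(p0, p1, p2, p3)
    )


def smooth_backbone(ca_positions, samples_per_segment=8):
    """Return a smooth polyline passing through every alpha-carbon position."""
    if len(ca_positions) < 2:
        return list(ca_positions)
    # Duplicate the end points so the first and last segments have neighbours.
    pts = [ca_positions[0]] + list(ca_positions) + [ca_positions[-1]]
    curve = []
    for i in range(1, len(pts) - 2):
        for s in range(samples_per_segment):
            curve.append(catmull_rom(pts[i - 1], pts[i], pts[i + 1], pts[i + 2],
                                     s / samples_per_segment))
    curve.append(tuple(ca_positions[-1]))
    return curve


# Hypothetical alpha-carbon positions, roughly 3.8 units apart.
ca_atoms = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (8.8, 3.6, 1.0)]
ribbon_centerline = smooth_backbone(ca_atoms, samples_per_segment=4)
```

A backbone renderer would draw straight sticks between the entries of `ca_atoms`, whereas a ribbon renderer would sweep a strip along `ribbon_centerline`.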

Chapter 3. Previous work

As the code-base used in this thesis has been developed during both a previous thesis [1] and an ongoing research project [2][3], a substantial amount of functionality was already implemented. Many things in the original framework do not affect the work done in this thesis; some previous implementations have, however, influenced the current design.

The framework contained a structure for molecules and atom data which incorporated most of the data required for implementing the functionality requested for this thesis.

The visual rendering in the framework was functional but basic, since it was never the main focus of any of the previous work. Molecule representations such as balls and sticks, sticks, backbone and Van der Waals were already implemented.

A way of culling parts of the molecule, or making them transparent, was provided and could be controlled by moving a plane with the haptic stylus. Rendering of the transparent parts of the molecule was, however, performed incorrectly.


Chapter 4. Problem description

The original list of requested features was narrowed down to those with the highest priority in order to make the time required for implementation feasible. The features addressed in this thesis are:

• Secondary structure assignment
  Determining the secondary structure from the available protein description data. This also requires modification of the molecule data-structure in order to properly support the secondary structure information.

• Spatial indexing of the molecule
  A structure for spatial lookup in the molecule is needed to optimize various algorithms.

• Transparency support
  In order to draw transparent surfaces correctly, there has to be some functionality for sorting the surfaces according to their depth in the scene. Transparency is important in order to make sure the ligand is not obscured by the protein while docking.

• Graphical user interface (GUI) to access the available features in the program
  No proper graphical user interface was available in the application. An intuitive user interface is needed in order to make the program more useful as a teaching tool.

• Easy loading of data sets and ligands
  In addition to the general graphical user interface there needs to be a way to easily select the data-sets to visualize from within the haptic interface.

• Addition of crosses for non-bonded atoms in sticks mode
  When using a molecule representation that only displays atom interconnections there needs to be a way of visualizing isolated atoms.

In addition to these specific problem areas there are other requested features with lower priority that have influenced several of the design choices. Examples include various additional rendering-modes as well as more advanced filtering of the molecule.

Chapter 5. Overview of methods

5.1. Secondary structure

There are several different algorithms for defining the secondary structure of a protein. The most commonly used secondary structure assignment technique is called Define Secondary Structure of Proteins (DSSP). This algorithm was presented by Wolfgang Kabsch and Christian Sander [5] in 1983. There are other algorithms that solve the same problem, such as P-Curve, DEFINE and STRIDE.

5.1.1. DSSP

The DSSP algorithm determines the secondary structure based on the hydrogen bonds within the protein. In the DSSP algorithm a hydrogen bond is said to exist if the carbon-oxygen pair in one residue is close enough to the nitrogen-hydrogen pair of another residue. The test is expressed as an electrostatic interaction energy, computed from the distances between the four atoms, that has to fall below a cutoff.

At the lowest level the DSSP algorithm looks for single hydrogen bonds and tries to classify them. Connections between residues three, four or five steps apart are classified as turns. A pair of hydrogen bonds that connects a residue in one part of the protein to one or two closely spaced residues in a completely different part of the amino acid sequence is classified as a bridge.

From these features higher-level structures are determined. Several turns in a row are classified as a helix. Several bridges are grouped together into a ladder, and ladders are in turn grouped together into a sheet.

5.2. Spatial indexing

Spatial indexing is a way to map position to data. It is used to optimize algorithms that operate on the position of objects, by avoiding an exhaustive search in operations such as finding neighbors.

Two approaches were considered for this project, a uniform grid and an octree. These are the most common ways of spatial indexing.
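To make the idea of mapping position to data concrete, the following is a minimal sketch of the grid approach (described in section 5.2.1) used for a neighbor query. It is not the implementation used in CFF; the class name, the cell size and the atom labels are hypothetical. For brevity it tests every object in every candidate cell, instead of first separating cells that are entirely inside the query volume from cells on its border.

```python
from collections import defaultdict
from math import dist, floor


class UniformGrid:
    """Buckets 3D points into cubic cells with side length `cell_size`."""

    def __init__(self, cell_size):
        self.cell_size = cell_size
        self.cells = defaultdict(list)  # (ix, iy, iz) -> [(position, payload)]

    def _cell_of(self, position):
        return tuple(floor(c / self.cell_size) for c in position)

    def insert(self, position, payload):
        self.cells[self._cell_of(position)].append((position, payload))

    def within_radius(self, center, radius):
        """Return the payloads of all points within `radius` of `center`."""
        lo = self._cell_of(tuple(c - radius for c in center))
        hi = self._cell_of(tuple(c + radius for c in center))
        found = []
        # Only cells overlapping the bounding box of the query sphere are visited.
        for ix in range(lo[0], hi[0] + 1):
            for iy in range(lo[1], hi[1] + 1):
                for iz in range(lo[2], hi[2] + 1):
                    # .get avoids creating empty cells while querying.
                    for position, payload in self.cells.get((ix, iy, iz), ()):
                        if dist(position, center) <= radius:
                            found.append(payload)
        return found


grid = UniformGrid(cell_size=4.0)
grid.insert((0.0, 0.0, 0.0), "N")
grid.insert((1.5, 0.0, 0.0), "CA")
grid.insert((10.0, 10.0, 10.0), "O")
neighbors = sorted(grid.within_radius((0.0, 0.0, 0.0), 2.0))
```

With a cell size close to the typical query radius, a query touches only a handful of cells regardless of how many atoms the molecule contains, which is the gain over an exhaustive search.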

5.2.1. Uniform grid

A uniform grid is a very simple form of spatial subdivision. A volume is split several times along its x, y and z axes into evenly sized cells, as shown in figure 5.1a. Any object residing in one of the cells is added to that cell’s list of objects.

When searching for objects in a certain subvolume the first step is to decide which cells are entirely outside, entirely inside or on the border of the subvolume. All objects in the cells entirely inside the subvolume can then be included. The objects in the cells on the border must be tested on a per-object basis, since border cells may contain objects both inside and outside the subvolume.

Figure 5.1: Spatial indexing structures: (a) uniform grid, (b) octree.

5.2.2. Octree

An octree, as seen in figure 5.1b, is a hierarchical structure used for spatial indexing. An octree is the three-dimensional analog of a two-dimensional structure called a quadtree. In two dimensions a split only creates four sub-regions, hence the quad prefix.

To construct an octree a volume is recursively split into eight sub-volumes until each sub-volume contains only one object, or a predefined maximum tree depth is reached. The splits are orthogonal to the x, y and z axes. In the implementation considered for this thesis the splits go through the center of the volume, so that all eight sub-volumes get the same size.

In order to search for objects within a certain radius from a point the tree is traversed from the top down, disregarding objects in regions that are completely outside the sphere of interest.

5.3. Transparency

The use of transparency when rendering real-time graphics can be a bit tricky. The normal way of handling hidden surface removal, the z-buffer, is not useful in an unmodified state. The problem with the z-buffer stems from the fact that the buffer only remembers the closest fragment for each pixel in the image. When a semitransparent fragment is drawn it sets the current z-depth for that pixel to that of the fragment. This means that the GPU will cull any later fragments drawn to the same position with a larger z-depth than the one in the z-buffer, even though they should be partially visible.

Two different ways of working around this problem have been considered, painter’s algorithm and depth-peeling.

5.3.1. Painter’s algorithm

Painter’s algorithm provides a well-known way of solving the transparency problem by sorting all surfaces in the scene back to front before drawing them. This provides an implicit hidden surface removal, since any surface closer to the camera overwrites previously drawn surfaces. If a surface is transparent it simply blends with the previously written pixels.

The algorithm cannot handle overlapping or intersecting surfaces such as the triangles shown in figure 5.2. Problematic surfaces must therefore be split to achieve unambiguous depth sorting.

Figure 5.2: Intersecting surfaces.

When working with a scene-graph there is also a potential problem in having to draw the surfaces in an order that is not native to the scene-graph. A scene-graph normally tries to group the rendering of similar surfaces in order to avoid costly rendering-state switches. This grouping cannot be done when using the painter’s algorithm, since the surfaces must be rendered strictly from back to front.

An additional problem is introduced by fragment-shaders. A fragment-shader may change the depth of the fragment it is processing. Since this is done on the GPU, any sorting performed previously in software may be rendered invalid.

5.3.2. Depth-peeling

Depth-peeling [4][6] is a way of rendering transparent objects without having to explicitly sort the surfaces.

The algorithm uses a multi-pass approach where the entire scene is rendered several times, each time peeling off another layer of transparency.

Depth-peeling employs a fragment-shader on the GPU, which means that all rendering must use this shader. The algorithm requires that the GPU supports an extra depth-buffer, or a frame-buffer with the same bit-depth as a depth-buffer. It can also be very slow if there are many layers of transparent surfaces to render, since this requires many passes through the rendering code.

The rendering order of the surfaces does not affect the final outcome. This means that the surfaces can be rendered in the order most suitable for the scene-graph.

5.4. Filtering

The application needs to be able to cull certain parts of the molecule, for example to improve visibility. In this report this operation is called filtering.

Two approaches were considered for dealing with this problem, per-object filtering and per-fragment filtering. They are, in theory, almost the same algorithm operating at different scales. The implementations are, however, quite different.

5.4.1. Per object filtering

The choice whether or not to cull a certain object can be made on a per-object level. This can be done without any special hardware support on the graphics card. It is also very easy to integrate into an existing rendering pipeline. Once an object has been culled it does not have to be sent to the GPU, which results in a slight performance improvement.

One problem with this kind of filtering appears when the filtered data-set is to be rendered in a way that does not draw the individual objects separately. For example, if two objects make up one visual element and one of the objects is culled, the rendering may produce an unexpected result.

There is also the question of which position in the object to use for the culling test. Several options are available, of which the simplest is probably to use the center-point of the object. Other options are to cull all objects having any part inside the culled volume, or all objects having more of their volume inside the culled volume than outside.

5.4.2. Per fragment filtering

The other option considered is to filter on a per-fragment basis. This type of filtering does not result in any significant performance gain when parts of the molecule are filtered out. The only possible gain is a small speedup on the GPU, since the pipeline might be able to drop fragments early and avoid unnecessary shading calculations. These filters can only be implemented properly using fragment-shaders on the GPU.

While per-object filtering has the option of culling based on things other than position, per-fragment filtering cannot do so efficiently. By default a shader has little information about a fragment beyond its position, so all other parameters have to be sent explicitly to the GPU, for example in the form of a texture or a color.
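The per-object variant from section 5.4.1 can be sketched in a few lines. The example below culls against a plane and shows two of the culling tests mentioned there: testing the center-point only, and culling any object that has some part inside the culled volume. The atom records, the function names and the plane are hypothetical, the atoms are treated as spheres, and the plane normal is assumed to have unit length.

```python
def signed_distance(point, plane_point, plane_normal):
    """Distance from the plane along its unit normal; negative is the culled side."""
    return sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal))


def filter_per_object(atoms, plane_point, plane_normal, mode="center"):
    """Return the atoms that survive culling against the plane.

    mode="center":   cull an atom when its center-point is on the culled side.
    mode="any_part": cull an atom when any part of its sphere is on the culled side.
    """
    kept = []
    for atom in atoms:
        d = signed_distance(atom["center"], plane_point, plane_normal)
        if mode == "any_part":
            d -= atom["radius"]
        if d >= 0.0:
            kept.append(atom)
    return kept


atoms = [
    {"name": "A", "center": (0.5, 0.0, 0.0), "radius": 1.0},   # straddles the plane
    {"name": "B", "center": (-2.0, 0.0, 0.0), "radius": 1.0},  # fully culled
    {"name": "C", "center": (3.0, 0.0, 0.0), "radius": 1.0},   # fully visible
]
origin, normal = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)

kept_by_center = [a["name"] for a in filter_per_object(atoms, origin, normal)]
kept_by_any_part = [a["name"] for a in filter_per_object(atoms, origin, normal, "any_part")]
```

Atom A shows why the choice of test matters: it is kept by the center-point test but removed when any overlap with the culled volume is enough.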


Chapter 6. Implementation

The application is currently implemented against the H3D haptic scene-graph framework (www.h3dapi.org). However, most parts of the application have been designed to enable easy migration to other frameworks should it prove necessary. Most of the actual functionality is implemented in scenegraph-agnostic classes, which are then sub-classed in order to create the actual H3D scenegraph nodes.

New user interface components are implemented entirely in H3D-specific code, since this allows us to use the H3D user interface (UI) add-on library. The UI functionality also makes use of the X3D field network for most of the implementation. X3D is the scene description standard used by H3D.

6.1. Determining the secondary structure

Since the DSSP algorithm, discussed in section 5.1.1, is more or less the standard way of defining the secondary structure of proteins, it was selected for the application. In order to avoid unnecessary introduction of bugs and extra work, the choice was made to use the original DSSP program directly instead of reimplementing the algorithm.

The DSSP program generates secondary structure information in the form of a dssp-file. Dssp-files include various chemical data about each residue in the protein. For this project only a small subset of the available information is interesting. Each line in the dssp-file contains the information for one residue, and the lines appear in the same order as the residues in the protein.

Listing 6.1: Excerpt from a dssp-file.

    215  218 A V  E     -l  147   0B    0 ...
    216  219 A S  E  >  -l  148   0B   12 ...
    217  220 A S  H  >  S+    0   0    44 ...
    218  221 A E  H  >  S+    0   0   120 ...
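As a rough illustration of how such a column-based record can be read, the sketch below extracts a few fields from one residue line. The character offsets are an assumption based on the classic fixed-column DSSP output and should be checked against the DSSP version in use. The names `DsspResidue`, `parse_dssp_residue` and `format_record` are hypothetical and not part of CFF or DSSP. The sketch also ignores the header that precedes the residue records in a real dssp-file.

```python
from typing import NamedTuple


class DsspResidue(NamedTuple):
    dssp_number: int        # sequential number assigned by the DSSP program
    pdb_number: int         # residue number taken from the pdb-file
    chain: str
    amino_acid: str
    summary: str            # secondary structure summary, e.g. "E" or "H"
    bridge_partners: tuple  # DSSP numbers of the bridge partners, 0 means none


def parse_dssp_residue(line):
    """Parse one residue record, assuming the classic fixed-column layout."""
    return DsspResidue(
        dssp_number=int(line[0:5]),
        pdb_number=int(line[5:10]),
        chain=line[11],
        amino_acid=line[13],
        summary=line[16],
        bridge_partners=(int(line[25:29]), int(line[29:33])),
    )


def format_record(number, pdb_number, chain, amino_acid, summary, details, bp1, bp2):
    """Build a record in the assumed layout; only used to create demo input."""
    return (f"{number:5d}{pdb_number:5d} {chain} {amino_acid}  {summary} "
            f"{details:<7}{bp1:4d}{bp2:4d}")


# Values taken from the first and third residue of listing 6.1.
records = [
    format_record(215, 218, "A", "V", "E", "    -l", 147, 0),
    format_record(217, 220, "A", "S", "H", ">  S+", 0, 0),
]
residues = [parse_dssp_residue(record) for record in records]
```

Fixed offsets are used instead of splitting on whitespace because several of the columns may be blank, in which case whitespace splitting would shift the remaining fields.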

(32) 22. Implementation. Listing 6.1 shows a few lines from a dssp-file. The lines are truncated to show only the data of interest to us. Dssp files are column-based, much like the Protein Data Bank (PDB) format, a widely used file format for protein structure description. The first column in the listing contains the residue number used internally by the dssp-program, the second column is the corresponding residue number from the pdb-file. The column most interesting to this project is the fifth, which contains a summary of the secondary structure for the current residue. The first two lines in the listing have a summary of E which stands for extended. Extended refers to an extended set of bridges. A bridge is made up of two hydrogen bonds as described in section 5.1.1. The two last lines have a summary of H, which signifies that the residues are a part of an α-helix. The DSSP program is called internally by our program and the output is saved to a temporary file which is then parsed. Thus avoiding any need for a separate preprocessing stage. The parser only considers helices and beta-sheets, all other structures are ignored. Helices are easy to parse since all residues in the helical structure appear on consecutive lines in the file. A beta-sheet typically features connections between residues in different parts of the protein. This requires a more advanced system for keeping track of which structure a beta-sheet residue belongs to. A residue that has been classified as belonging to a beta-sheet has one or two connected residues. When parsing a residue-entry from the dssp-file, the connected residues are examined in order to determine if they are part of an established beta-sheet, in which case the parsed residue is also assigned to that structure. If two previously defined sheets are connected by a residue the two structures are merged.. 6.2. Rendering. 
The requested features for the application include several different types of filtering of the dataset, as well as the ability to render different parts of the molecule using different rendering modes. Implementing these features requires a new rendering-system. To produce a codebase better suited for the inclusion of new rendering-modes, and to make simultaneous rendering-modes easier to implement, the data and the rendering of the molecule are separated. Rendering-specific variables such as size and color are reimplemented as pluggable parameters for the rendering-modes that use them. The old atom-structure, with color and size stored along with the rest of the atom data, is kept as an intermediate solution to avoid problems with legacy code.

6.2.1 Transparency

One of the requirements for the project is to implement a way of properly rendering transparent objects. As stated in the problem description, this is important in order to make sure that the ligand is visible at all times. The hard part is not to render transparent objects, but to make sure they are rendered in the correct order.
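The order dependence can be made concrete with a small worked example. The sketch below assumes the common "over" blending rule, result = alpha * source + (1 - alpha) * destination; the thesis does not state which blend function the application uses, so this rule and the names `over`, `back_to_front` and `wrong_order` are assumptions made for illustration.

```python
def over(src_rgb, src_alpha, dst_rgb):
    """Blend a source color over a destination color."""
    return tuple(src_alpha * s + (1.0 - src_alpha) * d
                 for s, d in zip(src_rgb, dst_rgb))


background = (0.0, 0.0, 0.0)
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)
alpha = 0.5

# The red surface is farther away than the blue one: draw red first, then blue.
back_to_front = over(blue, alpha, over(red, alpha, background))

# The same two surfaces drawn in the opposite order.
wrong_order = over(red, alpha, over(blue, alpha, background))
```

With these values `back_to_front` is (0.25, 0.0, 0.5) and `wrong_order` is (0.5, 0.0, 0.25): the pixel changes color depending on draw order alone, which is the kind of artifact that correct ordering removes.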

The choice of algorithm for achieving proper transparency stood between the painter's algorithm, as outlined in section 5.3.1, and depth-peeling, as outlined in section 5.3.2. The painter's algorithm was chosen for several reasons. First of all, using the depth-peeling algorithm would require a more extensive rewrite of the rendering-code. Another disadvantage of depth-peeling for this project is that scenes contain so many transparent layers, for example in the Van der Waals representation, that the renderer would have to go through the scene-graph an unreasonable number of times to draw them all. The data-structure does not have a good separation between transparent and opaque objects, meaning that the entire structure, both transparent and opaque objects, might have to be traversed for each peeling pass.

A scene in CFF typically includes two molecules, the protein and the ligand. These are in different nodes in the scene-graph, with one node for each molecule. The individual atoms in a molecule are represented in a manner separate from the scene-graph. To enable depth sorting with several molecules simultaneously, as is necessary if both protein and ligand are transparent, the gap between the rendering-functions of the different nodes needs to be bridged. This is done using a separate transparency-rendering node, placed last in the scene-graph to make sure the transparent surfaces are rendered after the opaque ones.

The transparency-rendering node works using lists of renderable objects. When a node wants to draw transparent renderable objects, it pushes them to the transparency-rendering node instead of drawing them directly. A renderable object needs to be able to render itself. It must also be depth-sortable, which requires some knowledge of its spatial position. In the currently implemented rendering-modes a renderable object is typically an atom.
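The deferred scheme described above can be sketched as follows. This is a minimal illustration and not the H3D node from the application: `Renderable`, `TransparencyQueue`, `push` and `flush` are hypothetical names, drawing is replaced by appending to a log, and the sort key is assumed to be the squared distance from a given eye position to one representative point per renderable.

```python
class Renderable:
    """Something that can draw itself and report a position for sorting."""

    def __init__(self, name, center):
        self.name = name
        self.center = center          # (x, y, z) used as the depth-sort key

    def render(self, log):
        log.append(self.name)         # stand-in for issuing real draw calls


class TransparencyQueue:
    """Collects transparent renderables and draws them last, far to near."""

    def __init__(self):
        self._pending = []

    def push(self, renderable):
        self._pending.append(renderable)

    def flush(self, eye, log):
        def squared_distance(renderable):
            return sum((c - e) ** 2 for c, e in zip(renderable.center, eye))

        # Painter's algorithm: the farthest renderable is drawn first.
        for renderable in sorted(self._pending, key=squared_distance, reverse=True):
            renderable.render(log)
        self._pending.clear()


queue = TransparencyQueue()
# Two molecule nodes push their atoms independently of each other.
queue.push(Renderable("protein atom A", (0.0, 0.0, -5.0)))
queue.push(Renderable("ligand atom", (0.0, 0.0, -2.0)))
queue.push(Renderable("protein atom B", (0.0, 0.0, -9.0)))

drawn = []
queue.flush(eye=(0.0, 0.0, 0.0), log=drawn)
```

Because all nodes push into the same queue, atoms from the protein and the ligand end up interleaved in one globally sorted draw order, which is what per-node sorting alone cannot provide.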
Depth sorting is currently implemented using only the center-point of each renderable, but provisions have been made to allow for proper per-triangle sorting when that becomes necessary. Since the rendering is delayed, the rendering-state of the scene-graph needs to be saved for each renderable. If each renderable were to handle its own state, rendering would include a state switch for every renderable in the transparency-rendering node. To avoid this, a separate state-switching object is implemented to handle changes in the rendering-state. If two consecutive renderables use the same state, the state-switcher notices this and avoids a state change.

The usefulness of the transparency-rendering node is not restricted to molecule-rendering, since it has the potential to do transparency-rendering for other H3D-nodes as well.

6.2.2 Filtering

The filters are implemented as a layer between the data-set and the rendering. Filtering is done using a tree of filters that work on the atom level. This corresponds to per-object filtering as discussed in section 5.4.1. Per-object filtering was chosen since the previous rendering-system used this method, and that functionality needed to be ported to the new renderer.


Figure 6.1: Current filter structure.

Filters in the filter-tree do not just handle the culling of atoms; they also keep track of their inputs and outputs and of the set of atoms that currently pass the filter test. To avoid the need for manual memory management, a reference-counting implementation is used for the filter-tree. The ownership-relation in the tree defaults to the parent owning its children, so when a parent is deleted the deletion is propagated down through the tree. Each child carries a weak reference to its parent in order to avoid reference-counting loops.

There are exceptions to the direction of the ownership. All render-modes essentially behave like leaves in the filter-tree, but they should not be deleted when their parent is removed. The render-modes also normally use a dummy-filter as input before any other has been set. This means that they need to own their dummy-filter, since they hold the only reference to it.

The current implementation tries to emulate the previous behavior, which did not make use of any tree structure for the filters. The filter structure is thus mostly non-tree-like, as shown in figure 6.1, where the only branching-point in the actual tree is the backbone-filter that strips all the non-alpha-carbon atoms from the molecule before sending it to a sticks-renderer. As new types of filters are added to the system, the structure is expected to make use of branches.

The filter-tree has its root in an adapter-filter. The adapter has the interface of a filter, but all of its atoms are taken unaltered from the underlying molecule data. This is followed by a plane-filter using parameters set through the same on-screen controls as the previous plane-filter implementation. The plane can be moved by selecting it with the haptic stylus and dragging it to the desired position and orientation.

One problem with the per-atom filtering technique is that its suitability depends on the way the data is rendered.
If the selected rendering-mode, for example, renders the interconnections between the atoms in the dataset, as is done in the balls and sticks rendering mode introduced in section 2.5, the connections between culled and un-culled atoms pose a problem.

Per-atom and per-fragment filtering are not mutually exclusive. Per-atom filtering could, for example, be used to remove large parts of the molecule in order to speed up the rendering, while per-fragment filtering could be used to create a clean cut along the border of the culled volume. This will probably be necessary in order to do proper culling for more advanced rendering-modes. For the current rendering-modes, however, per-atom filtering is adequate and requires fewer changes to the original rendering-code.

A filter generally works by applying a predicate to determine whether an atom is visible based on its position. Among the currently implemented filters the filter-plane is one example, since it discards atoms based on whether they are on a certain side of a plane. The predicate can also be based on data other than position, such as atom type, as is done for the backbone-representation.

The filter-node structure defines an interface that filters must adhere to. The actual implementation of a filter can be done in whatever way is best suited for the filter at hand. A filter that requires a lot of calculation to determine the included atoms could, for example, cache the processed atoms to increase performance. If the atoms in the data-set are updated, an update-event is sent down through the filter-tree. This allows any cache-based filters to update themselves only when needed, thus achieving lazy evaluation.

6.2.3 Rendering-modes

The different options for rendering the molecule, such as Van der Waals or balls and sticks as shown in figures 2.4b and 2.4a, are split up into different classes, each class handling one representation. This simplifies adding new renderers since it only involves the creation of a new class.
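The filter interface from section 6.2.2 and the one-class-per-representation design can be sketched together as follows. This is an illustration under stated assumptions rather than the application's code: all class and function names (`Atom`, `AtomSource`, `CachingFilter`, `plane_predicate`, `is_alpha_carbon`, `SticksRenderMode`) are hypothetical, the plane is assumed to be given as a normal and an offset, alpha carbons are assumed to be identified by the atom name "CA", and drawing is replaced by returning atom names.

```python
from collections import namedtuple

Atom = namedtuple("Atom", "name position")   # hypothetical minimal atom record


class AtomSource:
    """Adapter at the root: exposes the molecule's atoms unaltered."""

    def __init__(self, atoms):
        self._atoms = list(atoms)
        self.listeners = []

    def atoms(self):
        return self._atoms

    def set_atoms(self, atoms):
        self._atoms = list(atoms)
        for listener in self.listeners:
            listener.on_update()      # update-event sent down the tree


class CachingFilter:
    """Passes on the input atoms for which the predicate holds."""

    def __init__(self, source, predicate):
        self.source = source
        self.predicate = predicate
        self.listeners = []
        self._cache = None
        source.listeners.append(self)

    def atoms(self):
        if self._cache is None:       # recompute only after an update-event
            self._cache = [a for a in self.source.atoms() if self.predicate(a)]
        return self._cache

    def on_update(self):
        self._cache = None
        for listener in self.listeners:
            listener.on_update()


def plane_predicate(normal, offset):
    """Keep atoms on the positive side of the plane normal . p = offset."""
    def keep(atom):
        return sum(n * p for n, p in zip(normal, atom.position)) >= offset
    return keep


def is_alpha_carbon(atom):
    return atom.name == "CA"


class SticksRenderMode:
    """One class per representation; it only sees the atoms its filter passes."""

    def __init__(self, source):
        self.source = source

    def draw(self):
        return [atom.name for atom in self.source.atoms()]   # stand-in for drawing


molecule = AtomSource([
    Atom("N", (-1.0, 0.0, 0.0)),
    Atom("CA", (0.5, 0.0, 0.0)),
    Atom("C", (1.5, 0.0, 0.0)),
])
plane = CachingFilter(molecule, plane_predicate((1.0, 0.0, 0.0), 0.0))
backbone = CachingFilter(plane, is_alpha_carbon)
sticks = SticksRenderMode(backbone)
```

Calling `sticks.draw()` evaluates the chain once and then reuses the cached lists until `molecule.set_atoms(...)` invalidates them, which mirrors the lazy evaluation described in section 6.2.2.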
In the current implementation the molecule class handles all possible renderers for legacy reasons. Instead of operating on the entire molecule, as was done previously, the rendering-modes now operate on the atom-filters discussed in section 6.2.2. This allows different filters to be connected to different render-modes.

A way to easily control various aspects of the rendering, such as color or size, is implemented using function-objects that take an atom as input and produce an output value that controls the rendering of that particular atom. The different coloring-modes for the atoms are implemented as atom-to-color maps. It is also possible to implement a form of filtering using color-maps, by setting the alpha-component to zero for undesired atoms. Since the mappers are implemented as function-objects, creating a tree of mapping functions is easy. This allows different coloring modes to be combined on the same molecule.

6.3 User interface

The only user interface component implemented for this thesis is a file-selection dialog. The rest of the controls in the application also need a graphical user-interface, but the currently visualized protein is the setting that is most awkward to change, since it requires editing a file and restarting the visualization. To avoid having to leave the haptic environment to select the desired protein description file, a file-selection dialog was implemented as a node for the scene-graph.

Figure 6.2: The file-selection dialog.

The file-selector, seen in figure 6.2, is designed to work similarly to the file-selectors found in any 2D GUI widget-set. This is done to simplify implementation as well as to shorten the time required for learning the interface. All files in the currently selected directory appear in a list. If the directory contains more files than can be displayed, a scroll-bar is provided on the right side for navigating the list. The "/" symbol is used to signify that a file is in fact a directory, as is normal on UNIX-style operating systems. An icon depicting a folder would probably be more natural for a Windows user, but the underlying UI-library does not allow for any easy implementation of icons.

6.4 Spatial index

The type of spatial index chosen for implementation is a grid, as discussed in section 5.2.1. A grid is considered adequate for this project and is easier to implement than an octree.

The grid is sized according to the bounding-box of the molecule. The volume defined by the bounding-box is split into several cells of equal size and shape. Each cell has a list of the objects whose center is contained within the cell. Using only the center of each atom simplifies the implementation, since it allows for the assumption that each atom fits into only one cell. When atoms are added to the molecule or when their positions change, they are sorted into the correct cell of the grid by comparing their position against

the number of cells and the cell-size in the x, y and z dimensions to determine which cell to put the atom in. The atom's coordinate is downscaled slightly before the comparison to avoid numeric errors when handling atoms on the edge of the bounding-box.
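The cell lookup described above might look like the following sketch. The class name `Grid`, its methods, and the shrink factor of 1 - 1e-9 are assumptions chosen for illustration; the thesis does not state how large the downscaling is or how the cells are stored.

```python
class Grid:
    """Uniform grid over a bounding box; each atom lives in exactly one cell."""

    SHRINK = 1.0 - 1e-9   # hypothetical factor for the slight downscaling

    def __init__(self, lower, upper, cells_per_axis):
        self.lower = lower
        # Cell size per axis; a flat axis gets size 1.0 to avoid dividing by zero.
        self.sizes = tuple(((hi - lo) / n) or 1.0
                           for lo, hi, n in zip(lower, upper, cells_per_axis))
        self.cells = {}        # cell index -> set of atom ids
        self.location = {}     # atom id -> cell index

    def cell_of(self, position):
        # Positions inside the box give non-negative values, so int() acts as
        # floor. The shrink keeps points on the upper faces in the last cell.
        return tuple(
            int((p - lo) / size * self.SHRINK)
            for p, lo, size in zip(position, self.lower, self.sizes)
        )

    def place(self, atom_id, position):
        """Insert an atom, or move it if its position has changed."""
        cell = self.cell_of(position)
        old = self.location.get(atom_id)
        if old == cell:
            return
        if old is not None:
            self.cells[old].discard(atom_id)
        self.cells.setdefault(cell, set()).add(atom_id)
        self.location[atom_id] = cell


grid = Grid(lower=(0.0, 0.0, 0.0), upper=(10.0, 10.0, 10.0), cells_per_axis=(5, 5, 5))
grid.place(7, (1.0, 1.0, 1.0))       # lands in the first cell
grid.place(7, (10.0, 10.0, 10.0))    # moved onto the upper corner of the box
```

Without the shrink factor, the atom on the upper corner would map to index 5 on each axis, one past the last valid cell; with it the atom stays in cell (4, 4, 4).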


Chapter 7

Results

All the requested features described in chapter 4 have been implemented, with the exception of the graphical user interface, which has been extended with a file-selector but not completed. The file-selector proved to be somewhat difficult to use, especially the scrollbar. This is partly due to the H3D UI library, which currently has only very rudimentary functionality, and partly due to the use of traditional desktop interaction styles instead of concepts more suited to haptic environments.

The DSSP parser and the handling of the additional secondary structure data produce the anticipated visual results for all tested protein molecules.

The improvement in the rendering of transparent objects is shown in figure 7.1. A plane has been positioned in the middle of the molecule, making the left part transparent. Note how the density increases in the middle of the protein, and the lack of order-dependent artifacts in the rightmost image.

The new rendering system, including filters and mappers, has a rendering speed on par with the old system, while at the same time being more structured and extendable.

Figure 7.1: Rendering of transparent atoms. (a) Without sorting. (b) With sorting.

Table 7.1 lists the frame rate of the different visual representations with and without depth sorting. Several tests were made and the average of the frame rates for each representation was calculated. The frame rate drops somewhat when depth sorting is enabled. The only visual representation where this does not occur is the Van der Waals representation. A rise in frame rate when using depth sorting is unexpected and is probably the result of a bug in the non-sorting code.

    Representation      Sorted frame rate (fps)   Unsorted frame rate (fps)
    Balls and sticks    10                        31
    Backbone            99                        102
    Sticks              31                        37
    Van der Waals       62                        47

Table 7.1: Performance of the different visual representations.

Table 7.2 lists the number of separate objects rendered for the test protein using the different visual representations. These objects are considered atomic in the sorting algorithm. It is therefore these numbers that define the number of elements that need to be sorted. Unlike the other representations, the balls and sticks mode uses two types of renderable objects rather than only one. This means that the rendering state has to change several times during the transparency rendering, slowing down the algorithm somewhat.

    Representation      Number of objects
    Balls and sticks    5173
    Backbone            257
    Sticks              2663
    Van der Waals       2522

Table 7.2: Number of renderable objects in the different representations.

Chapter 8

Future work

The majority of the desired improvements identified in the evaluation prior to the start of this thesis have yet to be implemented. In addition, some features implemented in this thesis could be upgraded in a further iteration of improvements.

Reimplementing the secondary structure extraction could be considered, in order to avoid the licensing fee for the DSSP software in case this application is to be used in a commercial setting.

The file-selector widget is currently modeled after normal 2D desktop-style file-selectors. The user-friendliness of this widget could be improved by adopting a more haptic-aware approach, for example by adding haptic feedback to button-press events. The scrollbar would benefit from haptic forces that push the haptic stylus toward the center of the scrollbar, thus making it harder to slide off.

The different visual representations of molecules are still somewhat tightly coupled to the molecule data-structure. This should be further decoupled to achieve better extendability.

The spatial division is currently not used for anything in the application, but is provided for future extensions such as spatial selection tools. One problem with the current implementation of the spatial division is that it does not propagate down through the atom-filter-trees, which makes it hard to use for optimizing spatial filtering of the molecule.

The transparency sorting as currently implemented works well for rendering-modes where each atom is represented by a sphere, since this allows the sorting algorithm to work on a reasonably granular level. Rendering-modes such as ribbons would require sorting on a ribbon-section or polygon level, which introduces additional problems. For this rendering-mode and others like it, a depth-peeling method of transparency rendering would likely be better suited and could be used as an alternative to explicit back-to-front surface sorting.


Bibliography

[1] Petter Bivall Persson. Chemical interactions through force feedback. Master's thesis, Linköping University, February 2004.

[2] Petter Bivall Persson, Matthew D. Cooper, Lena A. E. Tibell, Shaaron Ainsworth, Anders Ynnerman, and Bengt-Harald Jonsson. Designing and evaluating a haptic system for biomolecular education. In Proceedings of IEEE Virtual Reality 2007, pages 171–178, Charlotte, North Carolina, USA, March 2007. IEEE.

[3] Petter Bivall Persson, Lena A. E. Tibell, Matthew D. Cooper, Anders Ynnerman, and Bengt-Harald Jonsson. Evaluating the effectiveness of haptic visualization in biomolecular education - feeling molecular specificity in a docking task. In Suan Yoong, Mokhtar Ismail, Ahmad Nurulazam Md. Zain, Fatimah Salleh, Fong Soon Fook, Lim Chap Sam, and Melissa Ng Lee Yan, editors, Proceedings of 12th IOSTE Symposium, pages 745–752, Malaysia, July-August 2006. Universiti Science Malaysia.

[4] C. Everitt. Interactive order-independent transparency, 2001.

[5] W. Kabsch and C. Sander. Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22(12):2577–2637, December 1983.

[6] Abraham Mammen. Transparency and antialiasing algorithms implemented with the virtual pixel maps technique. IEEE Computer Graphics and Applications, 9(4):43–55, 1989.
