Hilfe Warenkorb Konto Anmelden
 
 
   Schnellsuche   
     zur Expertensuche                      
Similarity Search - The Metric Space Approach
  Großes Bild
 
Similarity Search - The Metric Space Approach
von: Pavel Zezula, Giuseppe Amato, Vlastislav Dohnal, Michal Batko
Springer-Verlag, 2006
ISBN: 9780387291512
227 Seiten, Download: 12075 KB
 
Format:  PDF
geeignet für: Apple iPad, Android Tablet PC's Online-Lesen PC, MAC, Laptop

Typ: A (einfacher Zugriff)

 

 
eBook anfordern
Inhaltsverzeichnis

  Contents 7  
  Foreword 13  
  Preface 15  
  Acknowledgments 17  
  PART I METRIC SEARCHING IN A NUTSHELL 18  
     Chapter 1 FOUNDATIONS OF METRIC SPACE SEARCHING 22  
        1. The Distance Searching Problem 23  
        2. The Metric Space 25  
        3. Distance Measures 26  
           3.1 Minkowski Distances 27  
           3.2 Quadratic Form Distance 28  
           3.3 Edit Distance 29  
           3.4 Tree Edit Distance 30  
           3.5 Jaccard's Coefficient 30  
           3.6 Hausdorff Distance 31  
           3.7 Time Complexity 31  
        4. Similarity Queries 32  
           4.1 Range Query 32  
           4.2 Nearest Neighbor Query 33  
           4.3 Reverse Nearest Neighbor Query 34  
           4.4 Similarity Join 34  
           4.5 Combinations of Queries 35  
           4.6 Complex Similarity Queries 35  
        5. Basic Partitioning Principles 37  
           5.1 Ball Partitioning 37  
           5.2 Generalized Hyperplane Partitioning 38  
           5.3 Excluded Middle Partitioning 38  
        6. Principles of Similarity Query Execution 39  
           6.1 Basic Strategies 39  
           6.2 Incremental Similarity Search 42  
        7. Policies for Avoiding Distance Computations 43  
           7.1 Explanatory Example 44  
           7.2 Object-Pivot Distance Constraint 45  
           7.3 Range-Pivot Distance Constraint 47  
           7.4 Pivot-Pivot Distance Constraint 48  
           7.5 Double-Pivot Distance Constraint 50  
           7.6 Pivot Filtering 51  
        8. Metric Space Transformations 52  
           8.1 Metric Hierarchies 53  
              8.1.1 Lower-Bounding Functions 53  
           8.2 User-Defined Metric Functions 55  
              8.2.1 Searching Using Lower-Bounding Functions 55  
           8.3 Embedding Metric Space 56  
              8.3.1 Embedding Examples 56  
              8.3.2 Reducing Dimensionality 57  
        9. Approximate Similarity Search 58  
           9.1 Principles 58  
           9.2 Generic Algorithms 61  
           9.3 Measures of Performance 63  
              9.3.1 Improvement in Efficiency 63  
              9.3.2 Precision and Recall 63  
              9.3.3 Relative Error on Distances 65  
              9.3.4 Position Error 66  
        10. Advanced Issues 67  
           10.1 Statistics on Metric Datasets 68  
              10.1.1 Distribution and Density Functions 68  
              10.1.2 Distance Distribution and Density 69  
              10.1.3 Homogeneity of Viewpoints 71  
           10.2 Proximity of Ball Regions 72  
           10.3 Performance Prediction 75  
           10.4 Tree Quality IMeasures 77  
           10.5 Choosing Reference Points 80  
     Chapter 2 SURVEY OF EXISTING APPROACHES 84  
        1. Ball Partitioning IMethods 84  
           1.1 Burkhard- Keller Tree 85  
           1.2 Fixed Queries Tree 86  
           1.3 Fixed Queries Array 87  
           1.4 Vantage Point Tree 89  
           1.5 Excluded Middle Vantage Point Forest 92  
        2. Generalized Hyperplane Partitioning Approaches 93  
           2.1 Bisector Tree 93  
           2.2 Generalized Hyperplane Tree 94  
        3. Exploiting Pre- Computed Distances 95  
           3.1 AESA 95  
           3.2 Linear AESA 96  
           3.3 Other IMethods 97  
        4. Hybrid Indexing Approaclies 98  
           4.1 Multi Vantage Point Tree 98  
           4.2 Geometric Near-neighbor Access Tree 99  
           4.3 Spatial Approximation Tree 102  
           4.4 M-tree 104  
           4.5 Similarity Hashing 105  
        5. Approximate Similarity Search 106  
           5.1 Exploiting Space Transformations 106  
           5.2 Approximate Nearest Neighbors with BBD Trees 107  
           5.3 Angle Property Technique 109  
           5.4 Clustering for Indexing 111  
           5.5 Vector Quantization Index 112  
           5.6 Buoy Indexing 114  
           5.7 Hierarchical Decomposition of IMetric Spaces 114  
              5.7.1 Relative Error Approximation 115  
              5.7.2 Good Fraction Approximation 115  
              5.7.3 Small Chance Improvement Approximation 115  
              5.7.4 Proximity-Based Approximation 116  
              5.7.5 PAC Nearest Neighbor Search 116  
  PART II METRIC SEARCHING IN LARGE COLLECTIONS OF DATA 118  
     Chapter 3 CENTRALIZED INDEX STRUCTURES 122  
        1. M-tree Family 122  
           1.1 The M-tree 122  
           1.2 Bulk-Loading Algorithm of IVl-tree 126  
           1.3 Multi-Way Insertion Algorithm 129  
           1.4 The Slim Tree 130  
              1.4.1 Slim-Down Algorithm 131  
              1.4.2 Generalized Slim-Down Algorithm 133  
           1.5 Pivoting M-tree 135  
           1.6 The M+-tree 138  
           1.7 The M2 -tree 141  
        2. Hash-based metric indexing 142  
           2.1 The D-index 143  
              2.1.1 Insertion and Search Strategies 146  
           2.2 The eD-index 148  
              2.2.1 Similarity Self-Join Algorithm with eD-index 150  
        3. Performance Trials 153  
           3.1 Datasets and Distance Measures 154  
           3.2 Performance Comparison 155  
           3.3 Different Query Types 157  
           3.4 Scalability 158  
     Chapter 4 APPROXIMATE SIMILARITY SEARCH 162  
        1. Relative Error Approximation 162  
        2. Good Fraction Approximation 165  
        3. Small Chance Improvement Approximation 167  
        4. Proximity-Based Approximation 169  
        5. PAC Nearest Neighbor Searching 170  
        6. Performance Trials 171  
           6.1 Range Queries 172  
           6.2 Nearest Neighbors Queries 173  
           6.3 Global Considerations 176  
     Chapter 5 PARALLEL AND DISTRIBUTED INDEXES 178  
        1. Preliminaries 178  
           1.1 Parallel Computing 179  
           1.2 Distributed Computing 180  
              1.2.1 Scalable and Distributed Data Structures 180  
              1.2.2 Peer-to-Peer Data Networks 181  
        2. Processing M-trees with Parallel Resources 181  
           2.1 CPU Parallelism 182  
           2.2 I/O Parallelism 182  
           2.3 Object Declustering in M-trees 184  
        3. Scalable Distributed Similarity Search Structure 184  
           3.1 Architecture 185  
           3.2 Address Search Tree 186  
           3.3 Storage Management 186  
              3.3.1 Bucket Splitting 187  
              3.3.2 Choosing Pivots 188  
           3.4 Insertion of Objects 188  
           3.5 Range Search 189  
           3.6 Nearest Neighbor Search 190  
           3.7 Deletions and Updates of Objects 191  
           3.8 Image Adjustment 192  
           3.9 Logarithmic Replication Strategy 194  
           3.10 Joining the Peer-to-Peer Network 195  
           3.11 Leaving the Peer-to-Peer Network 195  
        4. Performance Trials 196  
           4.1 Datasets and Computing Infrastructure 197  
           4.2 Performance of Similarity Queries 197  
              4.2.1 Global Costs 198  
              4.2.3 Comparison of Search Algorithms 205  
           4.3 Data Volume Scalability 206  
  Concluding Summary 210  
  References 214  
  Author Index 228  
  Index 232  
  Abbreviations 236  


nach oben


  Mehr zum Inhalt
Kapitelübersicht
Kurzinformation
Inhaltsverzeichnis
Leseprobe
Blick ins Buch
Fragen zu eBooks?

  Medientyp
  eBooks
  eJournal
  alle

  Navigation
Belletristik / Romane
Computer
Geschichte
Kultur
Medizin / Gesundheit
Philosophie / Religion
Politik
Psychologie / Pädagogik
Ratgeber
Recht
Reise / Hobbys
Sexualität / Erotik
Technik / Wissen
Wirtschaft

  Info
Hier gelangen Sie wieder zum Online-Auftritt Ihrer Bibliothek
© 2008-2024 ciando GmbH | Impressum | Kontakt | F.A.Q. | Datenschutz