|
Contents |
5 |
|
|
Acknowledgments |
9 |
|
|
1 An Introduction to Reconfigurable Computing |
11 |
|
|
1.1 What is RC? |
11 |
|
|
1.2 RC Architectures |
13 |
|
|
1.3 How did RC originate? |
14 |
|
|
1.4 Inside the FPGA |
16 |
|
|
1.5 Mapping Algorithms to Hardware |
17 |
|
|
1.6 RC Applications |
18 |
|
|
1.7 Example: Dot Product |
19 |
|
|
1.8 Further Reading |
20 |
|
|
2 Reconfigurable Logic Devices |
21 |
|
|
2.1 Field-Programmable Gate Arrays |
22 |
|
|
2.1.1 Basic Architecture |
22 |
|
|
Programmable Logic |
24 |
|
|
Routing |
26 |
|
|
Programmable I/O Architectures |
31 |
|
|
2.1.2 Specialized Function Blocks |
32 |
|
|
Embedded Memory |
33 |
|
|
Embedded Arithmetic Logic |
33 |
|
|
High-Speed Serial I/O |
34 |
|
|
Embedded Microprocessors |
34 |
|
|
2.1.3 Programming Architecture |
36 |
|
|
2.2 Coarse-Grained Reconfigurable Arrays |
38 |
|
|
2.2.1 Raw |
39 |
|
|
2.2.2 PipeRench |
40 |
|
|
2.2.3 RaPiD |
42 |
|
|
2.2.4 PACT XPP |
43 |
|
|
2.2.5 MathStar |
45 |
|
|
2.3 Summary |
46 |
|
|
3 Reconfigurable Computing Systems |
47 |
|
|
3.1 Parallel Processing on Reconfigurable Computers |
47 |
|
|
3.1.1 Instruction Level Parallelism |
47 |
|
|
3.1.2 Task Level Parallelism |
49 |
|
|
3.2 A Survey of Reconfigurable Computing Systems |
51 |
|
|
3.2.1 I/O Bus Accelerator |
53 |
|
|
3.2.2 Massively Parallel FPGA array |
55 |
|
|
3.2.3 Reconfigurable Supercomputer |
55 |
|
|
3.2.4 Reconfigurable Logic Co-processor |
57 |
|
|
3.3 Summary |
59 |
|
|
4 Languages and Compilation |
61 |
|
|
4.1 Design Cycle |
61 |
|
|
4.2 Languages |
64 |
|
|
4.2.1 Algorithmic RC Languages |
65 |
|
|
Algorithmic Language Example |
67 |
|
|
4.2.2 Hardware Description Languages (HDL) |
67 |
|
|
A VHDL Example |
69 |
|
|
4.3 High Level Compilation |
70 |
|
|
4.3.1 Compiler Phases |
75 |
|
|
4.3.2 Analysis and Optimizations |
76 |
|
|
4.3.3 Scheduling |
77 |
|
|
4.4 Low Level Design Flow |
78 |
|
|
4.4.1 Logic Synthesis |
79 |
|
|
4.4.2 Technology Mapping |
80 |
|
|
4.4.3 Logic Placement |
81 |
|
|
4.4.4 Signal Routing |
82 |
|
|
4.4.5 Configuration Bitstreams |
83 |
|
|
4.5 Debugging Reconfigurable Computing Applications |
84 |
|
|
4.5.1 Basic Needs for Debugging |
84 |
|
|
Observability: |
84 |
|
|
Controllability: |
84 |
|
|
Execution Control: |
84 |
|
|
Debugging Data Bandwidth: |
84 |
|
|
System Execution Speed: |
85 |
|
|
Instrumentation Costs: |
85 |
|
|
Ease of Use: |
85 |
|
|
4.5.2 Debugging Facilities |
85 |
|
|
4.5.3 Challenges for RC Application Debugging |
94 |
|
|
4.6 Summary |
95 |
|
|
5 Digital Signal Processing Applications |
97 |
|
|
5.1 What is Digital Signal Processing? |
97 |
|
|
5.2 Why Use Recon.gurable Computing for DSP? |
99 |
|
|
5.2.1 Reconfigurable Computing’s Suitability for DSP |
99 |
|
|
5.2.2 Comparing DSP Implementation Technologies |
102 |
|
|
5.3 DSP Application Building Blocks |
106 |
|
|
5.3.1 Basic Operations and Elements |
107 |
|
|
5.3.2 Filtering |
112 |
|
|
5.3.3 Transforms |
113 |
|
|
5.4 Example DSP Applications |
118 |
|
|
5.4.1 Beamforming |
118 |
|
|
5.4.2 Software Radio |
122 |
|
|
5.5 Summary |
127 |
|
|
6 Image Processing |
129 |
|
|
6.1 RC for Image and Video Processing |
129 |
|
|
6.2 Local Neighborhood Functions |
131 |
|
|
6.2.1 Cellular Arrays for Pixel Parallelism |
133 |
|
|
6.2.2 Image Pipelines for Instruction-Level Parallelism |
133 |
|
|
6.3 Convolution |
134 |
|
|
6.4 Morphology |
135 |
|
|
6.5 Feature Extraction |
137 |
|
|
6.6 Automatic Target Recognition |
139 |
|
|
6.7 Image Matching |
141 |
|
|
6.8 Evolutionary Image Processing |
144 |
|
|
6.9 Summary |
149 |
|
|
7 Network Security |
151 |
|
|
7.1 Cryptographic Applications |
151 |
|
|
7.1.1 Cryptography Basics |
152 |
|
|
Symmetric Algorithms |
152 |
|
|
Block Symmetric Algorithms |
153 |
|
|
Stream ciphers, |
154 |
|
|
Asymmetric Algorithms |
154 |
|
|
7.1.2 RC Cryptographic Algorithm Implementations |
156 |
|
|
7.2 Network Protocol Security |
158 |
|
|
7.2.1 RC Network Interface |
158 |
|
|
7.2.2 Security Protocols |
161 |
|
|
7.2.3 Network Defense |
162 |
|
|
7.3 Summary |
165 |
|
|
8 Bioinformatics Applications |
167 |
|
|
8.1 Introduction |
167 |
|
|
8.2 Applications |
169 |
|
|
8.2.1 Genome Assembly |
169 |
|
|
8.2.2 Content-Based Search |
170 |
|
|
8.2.3 Genome Comparison |
170 |
|
|
8.2.4 Molecular Phylogeny |
171 |
|
|
8.2.5 Pattern Matching |
171 |
|
|
8.2.6 Protein Domain Databases |
172 |
|
|
8.3 Dynamic Programming Algorithms |
173 |
|
|
8.3.1 Alignments |
173 |
|
|
8.3.2 Dynamic Programming Equations |
174 |
|
|
8.3.3 Gap Functions |
176 |
|
|
8.3.4 Systolic DP Computation |
176 |
|
|
8.3.5 Backtracking |
177 |
|
|
8.3.6 Modulo Encoding |
179 |
|
|
8.3.7 FPGA Implementations |
180 |
|
|
8.4 Seed-Based Heuristics |
180 |
|
|
8.4.1 Filtering, Heuristics, and Quality Values |
181 |
|
|
8.4.2 BLAST: a 3-Stages Heuristic |
181 |
|
|
8.4.3 Seed Indexing |
182 |
|
|
8.4.4 FPGA Implementations |
184 |
|
|
8.5 Profiles, HMMs and Language Models |
184 |
|
|
8.5.1 Position-Dependent Pro.les |
184 |
|
|
8.5.2 Hidden Markov Models |
185 |
|
|
8.5.3 Language Models |
186 |
|
|
8.6 Bioinformatics FPGA Accelerators |
187 |
|
|
8.6.1 Splash |
188 |
|
|
8.6.2 Perle |
188 |
|
|
8.6.3 GenStorm |
188 |
|
|
8.6.4 RDisk |
188 |
|
|
8.6.5 BioXL/H |
191 |
|
|
8.6.6 DeCypher |
191 |
|
|
8.7 Summary |
191 |
|
|
9 Supercomputing Applications |
193 |
|
|
9.1 Introduction |
193 |
|
|
9.2 Monte Carlo Simulation of Radiative Heat Transfer |
194 |
|
|
9.2.1 Algorithm Description |
195 |
|
|
9.2.2 Hardware Implementation |
197 |
|
|
9.2.3 Performance |
198 |
|
|
9.3 Urban Road Traffic Simulation |
202 |
|
|
9.3.1 CA Traffic Modeling |
203 |
|
|
9.3.2 Intersections and Global Behavior |
204 |
|
|
9.3.3 Constructive Approach |
206 |
|
|
9.3.4 Streaming Approach |
208 |
|
|
9.4 Summary |
213 |
|
|
References |
215 |
|
|
Index |
243 |
|