A.1. Revision History

Intel® FPGA SDK for OpenCL™: Best Practices Guide

ID 683521

Date 12/08/2017

Version 17.1

Public

Table 12. Revision History: Getting Started Guide
Date	Version	Changes
December 2017	2017.12.08	Added the following new topics: Autorun Captures Tab; Autorun Profiler Data.
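November 2017	2017.11.06	Moved all topics into individual chapters. Changed some topic titles to task-based titles. Changed all occurrences of Fmax to f_max. Added new short descriptions of stall, occupancy, and bandwidth. Added a new image, with accompanying description, comparing parallel threads and loop pipelining. Added FPGA architecture, with accompanying description, in FPGA Overview. Added Minimizing the Memory Dependencies for Loop Pipelining. Added details of the area report hierarchy to Reviewing Area Information. Added best practices for channels and pipes. Updated Allocating Aligned Memory. Added Reducing the Area Consumed by Nested Loops Using loop_coalesce. Added Changing the Memory Access Pattern Example. Updated images. Implemented the single-dash and `-option=<value>` convention for aoc commands in the following topics: Optimizing Floating-Point Operations; Manual Partitioning of Global Memory; Cache Memory; Compilation Considerations; High Stall and High Occupancy Percentages. In Single Work-Item Kernel versus NDRange Kernel: removed the criteria for creating a single work-item kernel for a design; added new sample code and related descriptions; removed the subtopic on single work-item execution and merged its content into this topic.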
May 2017	2017.05.08	Renamed some functions in code examples as follows: read_channel_altera to read_channel_intel; write_channel_altera to write_channel_intel; read_channel_nb_altera to read_channel_nb_intel; write_channel_nb_altera to write_channel_nb_intel. Added Load-Store Units. Added Reviewing the Report Summary. Added Features of the Kernel Memory Viewer.
December 2016	2016.12.02	Minor editorial updates.
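October 2016	2016.10.31	Rebranded Altera SDK for OpenCL to Intel® FPGA SDK for OpenCL™. Rebranded Altera Offline Compiler to Intel® FPGA SDK for OpenCL™ Offline Compiler. In Align a Struct with or without Padding, modified the code snippets to correct the placement of attributes relative to the struct declaration. Added the topic Review Your Kernel's report.html File, with subtopics that describe the HTML GUI, the various reports the GUI provides, and a tutorial on using the information in the HTML report to optimize an OpenCL design example. Added the topic HTML Report: Kernel Design Concepts, which includes subtopics on kernels, global memory interconnect, local memory, nested loops, loops in a single work-item kernel, and channels. Removed the Optimization Report section and its associated subsections because the information is now part of the HTML report.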
May 2016	2016.05.02	Added the topic Removing Loop-Carried Dependencies Caused by Accesses to Memory Arrays to introduce the `ivdep` pragma. Under Strategies for Improving Memory Access Efficiency, added the following topics to explain how to use the `numbanks` and `bankwidth` kernel attributes to configure the geometry of the local memory system: Improve Kernel Performance by Banking the Local Memory; Optimize the Geometric Configuration of Local Memory Banks Based on Array Index. Under Strategies for Improving Memory Access Efficiency, added the topic Optimize Accesses to Local Memory by Controlling the Memory Replication Factor to explain the usage of the `singlepump` and `doublepump` kernel attributes. Updated the subsections under Addressing Single Work-Item Kernel Dependencies Based on Optimization Report Feedback to include the enhanced optimization report messages. Updated the figure Optimization Work Flow for a Single Work-Item Kernel to include the step of accessing the enhanced area report to review resource usage. Under Strategies for Improving NDRange Kernel Data Processing Efficiency, added the section Review Kernel Properties and Loop Unroll Status in the Optimization Report.
November 2015	2015.11.02	Added the topic Multi-Threaded Host Application. Added a note about memory barriers to Specify a Maximum Work-Group Size or a Required Work-Group Size.
May 2015	15.0.0	In Memory Access Considerations, added a note about the performance degradation that might occur when declaring __constant pointer arguments in kernels that target Cyclone® V devices. In Good Design Practices for Single Work-Item Kernel, removed the Initialize Data Prior to Usage in a Loop section and added the Declare Variables in the Deepest Scope Possible section. Added Removing Loop-Carried Dependency by Inferring Shift Registers; this topic describes how, in a single work-item kernel, to remove a loop-carried dependency by inferring a double-precision floating-point array as a shift register. Renamed Data Type Considerations to Data Type Selection Considerations.
December 2014	14.1.0	Reorganized the information flow of the Optimization Report Messages section to update the report messages and the layout of the optimization report. Included new optimization report messages that detail the reasons for unsuccessful or suboptimal pipelined execution. Added the Optimization Report Messages for Simplified Analysis of a Complex Design subsection under Optimization Report Messages to describe the new report message for simplified kernel analysis. Renamed Using Feedback from the Optimization Report to Address Single Work-Item Kernels Dependencies to Addressing Single Work-Item Kernel Dependencies Based on Optimization Report Feedback. Added the Transferring Loop-Carried Dependency to Local Memory subsection under Addressing Single Work-Item Kernel Dependencies Based on Optimization Report Feedback to describe a new strategy for resolving loop-carried dependencies.
June 2014	14.0.0	Renamed the document to Intel® FPGA SDK for OpenCL™ Best Practices Guide. Reorganized the information flow. Renamed Good Design Practices to Good OpenCL Kernel Design Practices. Added channel information to Transfer Data Via AOCL Channels. Added profiler information to Profile Your Kernel to Identify Performance Bottlenecks. Added the Single Work-Item Kernel Versus NDRange Kernel section. Updated the Single Work-Item Execution section. Removed the Performance Warning Messages section. Renamed Single Work-Item Kernel Programming Considerations to Good Design Practices for Single Work-Item Kernel.
December 2013	13.1.1	Updated the Specify a Maximum Work-Group Size or a Required Work-Group Size section. Added the Heterogeneous Memory Buffers section. Updated the Single Work-Item Execution section. Added the Performance Warning Messages section. Updated the Single Work-Item Kernel Programming Considerations section.
November 2013	13.1.0	Reorganized the information flow. Updated the Intel® FPGA SDK for OpenCL™ Compilation Flow section. Updated the Pipelines section; inserted the figure Example Multistage Pipeline Diagram. Removed the following figures: Instruction Flow through a Five-Stage Pipeline Processor; Vector Addition Kernel Compiled to an FPGA; Effect of Kernel Vectorization on Array Summation; Data Flow Implementation of a Four-Element Accumulation Kernel; Data Flow Implementation of a Four-Element Accumulation Kernel with Loop Unrolled; Complete Loop Unrolling; Unrolling Two Loop Iterations; Memory Master Interconnect; Local Memory Read and Write Ports; Local Memory Configuration. Updated the Good Design Practices section. Removed the following sections: Predicated Execution; Throughput Analysis; Case Studies. Renamed Optimizing Data Processing Efficiency to Optimization of Data Processing Efficiency. Renamed Replicating Compute Units versus Kernel SIMD Vectorization to Compute Unit Replication versus Kernel SIMD Vectorization. Renamed Using num_compute_units and num_simd_work_items Together to Combination of Compute Unit Replication and Kernel SIMD Vectorization. Renamed Memory Streaming to Contiguous Memory Accesses.
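June 2013	13.0 SP1.0	Updated the support status of OpenCL kernel source code that contains complex exit paths. Updated the figure Effect of Kernel Vectorization on Array Summation to correct the data flow between Store and Global Memory. Updated the content on the `unroll` pragma directive in the Loop Unrolling section. Updated the Local Memory section. Removed the Loop Unrolling with Vectorization section. Removed the figure Optimizing Local Memory Bandwidth.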
May 2013	13.0.1	Updated terminology; for example, pipeline is replaced with compute unit, and vector lane is replaced with SIMD vector lane. Added the following sections under Good Design Practices: Preprocessor Macros; Floating-Point versus Fixed-Point Representations; Recommended Optimization Methodology; Sequence of Optimization Techniques. Updated some of the code. Modified the figure Data Flow with Multiple Compute Units. Modified the figure Compute Unit Replication versus Kernel SIMD Vectorization. Modified the figure Optimizing Throughput Using Compute Unit Replication and SIMD Vectorization. Modified the figure Memory Streaming. Added the figure Local Memories Transferring Data Blocks within Matrices A and B. Reorganized the information flow. Updated figure, table, and example numbers. Added information on the new kernel attributes `max_share_resources` and `num_share_resources`.
May 2013	13.0.0	Updated the pipeline description. Updated the case study code examples and results tables. Updated figures.
November 2012	12.1.0	Initial release.