HON95, 2 years ago
Parent
Commit
c166cf93b2
6 changed files with 83 additions and 11 deletions
  1. index.md (+3 -2)
  2. se/cpp.md (+1 -1)
  3. se/go.md (+1 -1)
  4. se/openmp.md (+32 -7)
  5. se/rust.md (+42 -0)
  6. virt-cont/docker.md (+4 -0)

+ 3 - 2
index.md

@@ -137,13 +137,14 @@ _(Alphabetically sorted, so the ordering might seem a bit strange.)_
 
 - [C/C++ Tools](/se/ccpp-tools/)
 - [Clang/LLVM](/se/clang-llvm/)
-- [C++](/se/cpp/)
+- [C++ (Language)](/se/cpp/)
 - [Data Stuff](/se/data/)
 - [Database Management Systems (DBMSes)](/se/dbmses/)
 - [GNU Compiler Collection (GCC)](/se/gcc/)
-- [CUDA](/se/go/)
+- [Go (Language)](/se/go/)
 - [Licensing](/se/licensing/)
 - [OpenMP](/se/openmp/)
+- [Rust (Language)](/se/rust/)
 - [Web Security](/se/web-security/)
 
 ## Services

+ 1 - 1
se/cpp.md

@@ -1,5 +1,5 @@
 ---
-title: C++
+title: C++ (Language)
 breadcrumbs:
 - title: Software Engineering
 ---

+ 1 - 1
se/go.md

@@ -1,5 +1,5 @@
 ---
-title: CUDA
+title: Go (Language)
 breadcrumbs:
 - title: Software Engineering
 ---

+ 32 - 7
se/openmp.md

@@ -11,14 +11,32 @@ breadcrumbs:
 
 ## Target Offloading
 
-- **TODO** `target/device/teams/distribute/etc`
-- Declare/define target function: Add `#pragma omp begin declare target` before and `#pragma omp end declare target` after. It can now be used by both host and target.
+### Programming
+
+**TODO** Cleanup.
+
+- **TODO** `distribute`, `target data`
+- For NVIDIA GPUs, OpenMP _teams_ are similar to _blocks_ and map to _SMs_, while OpenMP _threads_ (within teams) map to _CUDA cores_ (within SMs).
+- `target`: Run the region on a device (GPU etc.). Only a single thread will be run if nothing more is specified. Often combined directly with `teams parallel`.
+- `teams`: Spawn a league of teams (like CUDA blocks).
+    - Each team will have an _initial_ thread which will execute the region.
+    - Must be combined with or nested directly within a target region.
+    - **TODO** Does this actually run the region with one thread in all teams?
+- `parallel` (with `target`): Spawn threads within the teams (like CUDA threads within the blocks).
+    - Makes all threads within the teams execute the region.
+    - May e.g. be specified for certain regions within a `target teams` region to control which parts should run with all threads and which should only be run by initial threads.
+- Use `barrier` within parallel regions to synchronize.
+- Use `target update ...` to update variables to/from device while inside a `target data` region.
+- Declare/define target function: Add `#pragma omp begin declare target` before and `#pragma omp end declare target` after. It can now be used by both host and target.
 - Try to avoid using library math functions as they may contain a lot of CPU-specific code like AVX-instructions which won't work in offloaded regions.
 - The host waits for target regions to finish. To run it asynchronously instead (as a task), specify `nowait`.
-- `depend(in/out: <var>)` may be used to declare variable dependencies for regions, mainly for use with tasks and `nowait` target regions.
-- Run region with a set number of teams (aka blocks in CUDA) and threads:
+- `depend(in/out: <var>)` may be used to declare variable dependencies for regions, mainly for use with tasks (like `nowait` target regions).
+
+#### Examples
+
+- Run region with a set number of teams and threads:
     ```c
-    // CUDA-equivalent: compute_stucc<<<1, 4>>>(args)
+    // CUDA-equivalent: compute_stuff<<<1, 4>>>(args)
     #pragma omp target teams num_teams(1)
     {
         before_stuff();
@@ -29,7 +47,14 @@ breadcrumbs:
         after_stuff();
     }
     ```
-- Use `#pragma omp target update ...` to update variables to/from device.
-- `#pragma omp barrier` works inside target blocks too.
+
+### Building
+
+- For GPU-offloaded OpenMP support, compile with e.g. `-fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_86` (NVIDIA RTX 3090) or `-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx1030` (AMD RX 6900 XT).
+- For useful OpenMP-aware optimization debug info, compile with `-Rpass=openmp-opt -Rpass-missed=openmp-opt`. Use `-Rpass-analysis=openmp-opt` too for even more info.
+
+### Miscellanea
+
+- Run with `LIBOMPTARGET_INFO=1` to show runtime info like when kernels are executed on the devices.
 
 {% include footer.md %}
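
Note on the `se/openmp.md` changes above: the `target`, `teams`, `parallel` and `declare target` bullets could be combined roughly as in the sketch below. It is a minimal illustration, not code from the commit. The names `square` and `compute_squares` are hypothetical, and the `map(tofrom: ...)` clause is an assumption that the notes do not cover (it copies the array to the device and back). Without OpenMP enabled the pragmas are ignored and the loop simply runs serially on the host.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper. The declare target pair makes it callable
// from both host code and offloaded (target) regions.
#pragma omp begin declare target
static int square(int x) {
    return x * x;
}
#pragma omp end declare target

// Hypothetical example function: fills out[i] with i*i for 0 <= i < n.
void compute_squares(int *out, int n) {
    assert(out != NULL && n >= 0);

    // Run the region on the default device with a single team.
    // map(tofrom: ...) is an assumption, not taken from the notes.
    #pragma omp target teams num_teams(1) map(tofrom: out[0:n])
    {
        // Outside of parallel regions, only the team's initial thread runs.
        // Inside, the loop iterations are shared among the team's threads.
        #pragma omp parallel for num_threads(4)
        for (int i = 0; i < n; i++) {
            out[i] = square(i);
        }
    }
}
```

To actually offload, this would be built with the Clang flags listed under "Building" in the diff; running with `LIBOMPTARGET_INFO=1` should then show whether a kernel was launched on the device.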

+ 42 - 0
se/rust.md

@@ -0,0 +1,42 @@
+---
+title: Rust (Language)
+breadcrumbs:
+- title: Software Engineering
+---
+{% include header.md %}
+
+## Setup
+
+Use Rustup, which manages your Rust installation(s), makes upgrades easy, and lets you install different toolchains and components.
+
+## Commands
+
+- Build and run: `cargo run`
+- Build: `cargo build`
+    - Builds to `target/<profile>/<name>`
+- Test: `cargo test`
+- Clean: `cargo clean`
+    - Removes generated artefacts (the whole `target` directory by default).
+- Lint (Clippy): `cargo clippy`
+    - Note: Certain options (like `-D warnings`) must be placed after `--` in the command.
+    - Add `[--] -D warnings` to fail on any warnings.
+    - Add `[--] -A clippy::branches-sharing-code` (example) to ignore (allow) certain lints.
+    - Add `--all-targets --all-features` to lint all targets and all features too.
+
+### Dependencies
+
+- Update dependencies: `cargo update`
+    - Make sure to manually update your pinned versions in `Cargo.toml` before this.
+- Show dependency graph: `cargo tree`
+    - Show enabled features: `cargo tree -e features`
+    - Show packages being built multiple times (duplicates): `cargo tree -d`
+    - Show inverted tree for some package: `cargo tree -i <package>` (useful with e.g. `-e features`)
+
+### Miscellanea
+
+- Upgrade edition (e.g. from 2018 to 2021):
+    1. Run `cargo fix --edition` to automatically fix your code.
+    1. Change the `edition` in `Cargo.toml` (e.g. from 2018 to 2021).
+    1. Run `cargo build` or `cargo test` to verify it worked.
+
+{% include footer.md %}

+ 4 - 0
virt-cont/docker.md

@@ -82,6 +82,10 @@ The toolkit is used for running CUDA applications within containers.
     - Publish network port on host: `-p <host-port>:<cont-port>[/udp]`
     - Mount volume: `-v <host-path>:<container-path>`
         - The host path must have a path prefix like `./` or `/` if it is a file/dir and not a named volume.
+- Images:
+    - Show manifest: `docker manifest inspect <name>:<tag>`
+        - This also shows which architectures the image supports, in a useful format.
+        - Images often support different architectures for different tags.
 - Cleanup:
     - Prune unused images: `docker image prune -a`
     - Prune unused volumes: `docker volume prune`