Self-Quiz — Original Practice Questions

50 original exam-style questions. No dumps. Answers at the bottom.

Instructions: Set a 60-minute timer. Read each question fully before answering. Multiple-select questions are marked in the question text, for example "(select two)".

Domain 1 — Essential AI Knowledge

1. Which NVIDIA framework is specifically designed for training, fine-tuning, and deploying large language models, automatic speech recognition, and text-to-speech models?

A) TensorRT
B) RAPIDS
C) NeMo
D) Triton Inference Server

2. A data scientist wants GPU-accelerated equivalents of pandas and scikit-learn to speed up data preprocessing on an NVIDIA GPU cluster. Which NVIDIA product addresses this?

A) NCCL
B) RAPIDS
C) TensorRT
D) DOCA

3. Which statement correctly describes the relationship between AI, ML, and Deep Learning?

A) ML is a subset of DL; DL is a subset of AI
B) AI is a subset of ML; ML is a subset of DL
C) DL is a subset of ML; ML is a subset of AI
D) All three are independent, parallel fields

4. What two resources do scaling laws (Chinchilla) indicate primarily determine the quality of a trained language model?

A) Number of layers and number of attention heads
B) Compute (total training FLOPs) and dataset size (tokens)
C) GPU memory and interconnect bandwidth
D) Batch size and learning rate

5. An organization needs to run LLM inference alongside professional visualization (ray tracing, rendering) on a single GPU server. Which GPU family best meets this requirement?

A) Hopper (H100)
B) Ada Lovelace (L40S)
C) Blackwell (B200)
D) Grace CPU

6. What is the primary purpose of TensorRT in an AI deployment pipeline?

A) Training large language models
B) Collective communications for distributed training
C) Optimizing trained models for inference on NVIDIA GPUs
D) Monitoring GPU health in production

7. Which statement about NVIDIA NGC is correct?

A) NGC is a paid cloud service for running AI training jobs
B) NGC is a free catalog of GPU-optimized containers and pre-trained models
C) NGC is NVIDIA’s BMC management platform
D) NGC requires an NVIDIA AI Enterprise license

8. During transformer model training, which operation consumes the most GPU memory?

A) Storing model weights in FP32
B) Storing all intermediate activations for backpropagation
C) Loading training data from storage
D) Gradient clipping

9. Which GPU is recommended for Virtual Desktop Infrastructure (VDI) workloads in the NVIDIA data center GPU use-case matrix?

A) H100
B) B200
C) L40S
D) L4

10. The NVIDIA B200 GPU introduced which new precision format not present in Hopper?

A) INT8
B) FP8
C) BF16
D) FP4

Domain 2 — AI Infrastructure

11. A 70B parameter model is to be served in FP16 precision. What is the minimum GPU memory required to hold the model weights?

A) 35 GB
B) 70 GB
C) 140 GB
D) 280 GB

12. What is the key advantage of scale-up (multi-GPU within one node) compared to scale-out (multi-node)?

A) Higher fault tolerance
B) Infinite horizontal scalability
C) Much higher inter-GPU bandwidth and lower latency (NVLink)
D) Lower cost per GPU

13. How many NVSwitch chips does the DGX H100 contain?

A) 2
B) 4
C) 8
D) 16

14. What does a PUE (Power Usage Effectiveness) of 1.15 indicate?

A) 15% of IT power is wasted on cooling
B) 15% of total power goes to IT equipment
C) Total facility power is 15% higher than IT equipment power
D) The facility is 15% below average efficiency

15. Which cooling method is required for racks exceeding 40–50 kW power density?

A) Hot-aisle/cold-aisle containment with CRAC units
B) Direct Liquid Cooling (DLC)
C) Rear-door fan walls
D) Immersion cooling only

16. What is the fundamental difference between E-W (East-West) and N-S (North-South) traffic in an AI cluster?

A) E-W is encrypted; N-S is plaintext
B) E-W is GPU-to-GPU training traffic requiring RDMA; N-S is user/management traffic using standard TCP
C) E-W uses Ethernet; N-S uses InfiniBand
D) E-W is higher latency than N-S

17. NVIDIA states the key measurement for an AI-optimized network is:

A) Peak bandwidth utilization
B) 99th percentile packet latency
C) How long an AI training job takes from start to finish
D) Total throughput in Gbps

18. What does RDMA’s “kernel bypass” feature provide?

A) Allows the OS kernel to skip memory validation for faster transfers
B) Enables data transfer without involving the OS kernel, reducing latency and CPU utilization
C) Bypasses encryption for lower overhead
D) Allows GPUs to skip PCIe and use NVLink for network transfers

19. Which NVIDIA products must work together to enable RoCE adaptive routing and congestion control?

A) ConnectX-8 SuperNIC + Quantum-X800 switch
B) BlueField-3 DPU + Spectrum-4 switch
C) H100 GPU + ConnectX-7 NIC
D) DOCA + NCCL

20. InfiniBand NDR provides what data rate per port?

A) 200 Gbps
B) 400 Gbps
C) 800 Gbps
D) 1,600 Gbps

21. What is the purpose of NVIDIA SHARP in the Quantum-X800 InfiniBand platform?

A) Adaptive routing to avoid congested switch ports
B) Hardware security acceleration for encrypted communications
C) Performing collective operations (e.g., all-reduce) inside the switch fabric
D) Compressing packets to increase effective bandwidth

22. Which storage type is most appropriate for active training datasets on a large GPU cluster with hundreds of concurrent data workers?

A) Local NVMe (per node)
B) NFS over 1GbE
C) Parallel distributed file system (Lustre, WEKA)
D) Object storage (S3-compatible)

23. The DPU’s “Isolate” function protects the infrastructure by:

A) Encrypting all data in transit between GPU and CPU
B) Moving infrastructure control/data plane to a separate DPU domain that operates independently if the host OS is compromised
C) Creating isolated VLAN segments for each tenant
D) Partitioning GPU memory into isolated regions

24. The GB300 NVL72 contains how many Blackwell Ultra GPUs and Grace CPUs?

A) 8 GPUs, 4 CPUs
B) 36 GPUs, 72 CPUs
C) 72 GPUs, 36 CPUs
D) 128 GPUs, 64 CPUs

25. What does “co-packaged optics” (CPO) enable in NVIDIA Photonics switches?

A) Running optical transceivers in a separate switch chassis
B) Integrating optical components directly in the switch ASIC package, reducing power and enabling higher port density
C) Connecting CPUs and GPUs via fiber instead of PCIe
D) Using photonic quantum computing for switching decisions

26. An AI Factory uses NVLink for scale-up and which technology for scale-out across nodes?

A) Standard Ethernet
B) InfiniBand
C) RoCE
D) PCIe Gen6

27. NVIDIA BlueField-3 provides how much networking bandwidth?

A) 100 GbE
B) 200 GbE
C) 400 GbE
D) 800 GbE

28. Which two factors determine GPU selection for an AI training workload? (select two)

A) GPU memory (model must fit)
B) Number of PCIe lanes available
C) Memory bandwidth
D) GPU height (1U vs 2U)

29. What is GPUDirect RDMA’s primary benefit?

A) Connects GPUs across data centers via fiber
B) Eliminates the GPU→CPU memory copy in the data transfer path, saving copy operations and reducing PCIe transactions
C) Provides encrypted GPU-to-GPU communication
D) Enables GPUs to directly access InfiniBand switches without a NIC

30. What type of data center management does a BMC provide?

A) In-band management via the production network
B) Out-of-band management independent of server OS state
C) Application-level monitoring via SNMP
D) Container orchestration

Domain 3 — AI Operations

31. What does DCGM stand for?

A) Data Compute GPU Metrics
B) Data Center GPU Manager
C) Distributed CUDA GPU Monitor
D) Device Configuration and GPU Management

32. In a Kubernetes cluster, what does the NVIDIA Device Plugin provide?

A) Physical GPU drivers for the host kernel
B) Exposes NVIDIA GPUs as schedulable Kubernetes resources (nvidia.com/gpu)
C) Monitors GPU utilization and sends alerts
D) Configures MIG partitions automatically

33. What is gang scheduling and why is it required for distributed AI training?

A) Scheduling jobs in priority order to maximize GPU utilization
B) Allocating all nodes of a job simultaneously before the job starts, preventing partial allocation deadlocks
C) Grouping similar jobs together to batch process them
D) Scheduling GPU operations in parallel within a single node

34. Which NVIDIA tool automatically deploys the entire GPU software stack on Kubernetes nodes?

A) DCGM
B) Triton Inference Server
C) NVIDIA GPU Operator
D) NVIDIA Base Command Platform

35. What is the maximum number of MIG instances on a single H100 80GB GPU?

A) 4
B) 7
C) 8
D) 16

36. Which virtualization method provides the strongest isolation between GPU tenants?

A) GPU passthrough
B) NVIDIA vGPU
C) MIG (Multi-Instance GPU)
D) Docker containers without virtualization

37. A training job shows GPU Utilization at 95% but Model FLOP Utilization (MFU) is only 20%. What is the most likely root cause?

A) Network bottleneck causing gradient exchange delays
B) Many small CUDA kernel launches with high overhead, rather than sustained Tensor Core compute
C) Insufficient GPU memory
D) Thermal throttling

38. The monitoring stack for GPU clusters typically flows in which order?

A) Grafana → Prometheus → DCGM Exporter → DCGM
B) DCGM → DCGM Exporter → Prometheus → Grafana
C) DCGM → Grafana → Prometheus → Alerts
D) Prometheus → DCGM → Grafana → DCGM Exporter

39. What ECC error type requires immediate investigation and potentially GPU page retirement?

A) Single-bit errors (SBE) — correctable
B) Double-bit errors (DBE) — uncorrectable
C) Triple-bit errors (TBE)
D) Parity errors

40. Which platform is NVIDIA’s purpose-built solution for managing DGX clusters, including job scheduling, software lifecycle, and monitoring?

A) NVIDIA Fleet Command
B) NVIDIA Base Command Platform
C) Kubernetes with GPU Operator
D) Slurm with DCGM

41. Which scheduler is most appropriate for managing large multi-node distributed training jobs in a research/HPC environment?

A) Kubernetes (vanilla)
B) Slurm
C) Docker Swarm
D) Apache Mesos

42. NVIDIA vGPU requires which software license?

A) No license — vGPU is included with GPU hardware
B) NVIDIA AI Enterprise
C) NVIDIA DGX software
D) CUDA Enterprise License

43. At what GPU temperature does thermal throttling typically begin?

A) 60°C
B) 75°C
C) 83–87°C
D) 100°C

44. Which Slurm option requests 4 GPUs per node using the generic resource (GRES) syntax?

A) --gpus=4
B) --gres=gpu:4
C) --resource=gpu:4
D) --accelerator=4

45. What is the primary purpose of Run:ai in a GPU cluster?

A) Replace DCGM for GPU health monitoring
B) Kubernetes-native GPU scheduling with AI-specific features: quotas, fractions, preemption
C) Serve inference via REST API
D) Manage InfiniBand switch configurations

Mixed / Integration Questions

46. An organization is deploying a multi-tenant GPU cluster for 20 data scientists. They need each scientist to have a guaranteed, isolated GPU resource allocation with no performance interference between users. The cluster uses H100 GPUs. Which virtualization method is best?

A) GPU passthrough (20 GPUs for 20 scientists)
B) vGPU with time-slicing
C) MIG with 2g.20gb profiles (3 instances per GPU, so 7 GPUs provide 21 instances for 20 users)
D) Docker containers without virtualization

47. A training cluster has 99% GPU utilization but the team complains training is slow and gradient all-reduce is taking too long. The cluster uses 100 Gbps Ethernet. What should the infrastructure team investigate?

A) Increase GPU memory
B) Upgrade to InfiniBand or higher-bandwidth Ethernet (400+ Gbps)
C) Add more CPU cores
D) Increase local NVMe storage

48. Which statement about NVIDIA-Certified Systems is correct?

A) They are only available from NVIDIA directly
B) They are partner OEM systems validated against a baseline configuration for performance, security, and scale
C) They require NVIDIA AI Enterprise license to operate
D) They are only available in PCIe form factor

49. A company needs to build an AI cluster that processes regulated healthcare data. The team wants maximum GPU performance for LLM training. Which deployment model is most appropriate?

A) Public cloud (AWS/Azure/GCP)
B) On-premises with DGX systems and InfiniBand
C) Hybrid: training in cloud, data stored on-prem
D) Edge deployment using Jetson modules

50. NVIDIA Spectrum-X is described as the “World’s First Ethernet Platform for AI.” What specific combination of technologies enables near-InfiniBand RDMA performance over standard Ethernet?

A) PCIe Gen6 + 800 Gbps Ethernet
B) RoCEv2 + NCCL-optimized adaptive routing + congestion control, requiring BlueField-3 DPU + Spectrum-4 switch
C) TCP over 100 Gbps Ethernet with hardware offload
D) InfiniBand protocol encapsulated in Ethernet frames

Answers

Q	A	Q	A	Q	A	Q	A	Q	A
1	C	11	C	21	C	31	B	41	B
2	B	12	C	22	C	32	B	42	B
3	C	13	B	23	B	33	B	43	C
4	B	14	C	24	C	34	C	44	B
5	B	15	B	25	B	35	B	45	B
6	C	16	B	26	B	36	C	46	C
7	B	17	C	27	C	37	B	47	B
8	B	18	B	28	A,C	38	B	48	B
9	D	19	B	29	B	39	B	49	B
10	D	20	B	30	B	40	B	50	B
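
Worked arithmetic for Q11, Q14, and Q46: the three questions whose answers come from a calculation can be checked with a minimal sketch like the one below. The function names are hypothetical helpers, not part of any NVIDIA tool. The sketch assumes decimal gigabytes (1 GB = 10^9 bytes), counts model weights only (no KV cache, activations, or optimizer state), and takes the 3-instances-per-GPU figure for the 2g.20gb profile from Q46 itself.

```python
import math


def weight_memory_gb(params_billion: float, bytes_per_param: int) -> float:
    """Memory for model weights only, in decimal GB.

    Billions of parameters times bytes per parameter gives GB directly,
    because 1e9 parameters * N bytes = N * 1e9 bytes = N GB.
    """
    return params_billion * bytes_per_param


def facility_overhead_percent(pue: float) -> float:
    """PUE = total facility power / IT equipment power.

    Returns how much higher total power is than IT power, in percent.
    Rounded to hide floating-point noise.
    """
    return round((pue - 1.0) * 100, 6)


def gpus_needed(users: int, instances_per_gpu: int) -> int:
    """Whole GPUs required so that every user gets one MIG instance."""
    return math.ceil(users / instances_per_gpu)


# Q11: 70B parameters in FP16 (2 bytes per parameter)
q11_gb = weight_memory_gb(70, 2)

# Q14: PUE of 1.15
q14_overhead = facility_overhead_percent(1.15)

# Q46: 20 users, 3 MIG instances per GPU (per the question)
q46_gpus = gpus_needed(20, 3)

print(f"Q11: {q11_gb} GB of weights")
print(f"Q14: total power is {q14_overhead}% above IT power")
print(f"Q46: {q46_gpus} GPUs ({q46_gpus * 3} instances)")
```

The results (140 GB, 15%, and 7 GPUs) match answer C for each of the three questions. Changing `bytes_per_param` to 1 or 4 gives the FP8/INT8 and FP32 footprints for the same model.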

Score interpretation

Score	Readiness
45–50	Exam ready — schedule it
40–44	Nearly ready — review weak domains
35–39	Good foundation — 1 more week of study
< 35	Focus on Domain 2 (networking + DPU) then retake
