Merge pull request #2559 from oneapi-src/release/2025.0_AITools
2025.0 AI Tools Release
jimmytwei authored Dec 17, 2024
2 parents d7a4270 + 606c5b1 commit d3e1ac1
Showing 16 changed files with 206 additions and 366 deletions.
@@ -17,9 +17,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Genetic Algorithms on GPU using Intel Distribution of Python numba-dpex\n",
"# Genetic Algorithms on GPU using Intel Distribution of Python \n",
"\n",
"This code sample shows how to implement a basic genetic algorithm with Data Parallel Python using numba-dpex.\n",
"This code sample shows how to implement a basic genetic algorithm with Data Parallel Python using the Data Parallel Extension for NumPy (dpnp).\n",
"\n",
"## Genetic algorithms\n",
"\n",
@@ -98,7 +98,7 @@
"\n",
"### Simple evaluation method\n",
"\n",
"We are starting with a simple genome evaluation function. This will be our baseline and comparison for numba-dpex.\n",
"We are starting with a simple genome evaluation function. This will be our baseline and comparison for dpnp.\n",
"In this example, the fitness of an individual is computed by an arbitrary set of algebraic operations on the chromosome."
]
},
@@ -317,9 +317,9 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## GPU execution using numba-dpex\n",
"## GPU execution using dpnp\n",
"\n",
"We need to start with new population initialization, as we want to perform the same operations but now on GPU using numba-dpex implementation.\n",
"We need to start with a new population initialization, because we want to perform the same operations, but now on the GPU using the dpnp implementation.\n",
"\n",
"We set the same random seed as before to reproduce the results. "
]
@@ -344,11 +344,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Evaluation function using numba-dpex\n",
"### Evaluation function using Data Parallel Extension for NumPy\n",
"\n",
"The only par that differs form the standard implementation is the evaluation function.\n",
"The only part that differs from the standard implementation is the evaluation function.\n",
"\n",
"The most important part is to specify the index of the computation. This is the current index of the computed chromosomes. This serves as a loop function across all chromosomes."
"In this implementation we benefit from vectorized operations. DPNP automatically vectorizes addition, subtraction, and multiplication operations, making them efficient and suitable for GPU acceleration."
]
},
{
@@ -364,31 +364,28 @@
},
"outputs": [],
"source": [
"import numba_dpex\n",
"from numba_dpex import kernel_api\n",
"import dpnp\n",
"\n",
"@numba_dpex.kernel\n",
"def eval_genomes_sycl_kernel(item: kernel_api.Item, chromosomes, fitnesses, chrom_length):\n",
" pos = item.get_id(0)\n",
"def eval_genomes_dpnp(chromosomes_list, fitnesses):\n",
" num_loops = 3000\n",
" for i in range(num_loops):\n",
" fitnesses[pos] += chromosomes[pos*chrom_length + 1]\n",
" for i in range(num_loops):\n",
" fitnesses[pos] -= chromosomes[pos*chrom_length + 2]\n",
" for i in range(num_loops):\n",
" fitnesses[pos] += chromosomes[pos*chrom_length + 3]\n",
"\n",
" if (fitnesses[pos] < 0):\n",
" fitnesses[pos] = 0"
" # Calculate fitnesses using vectorized operations\n",
" fitnesses += chromosomes_list[:, 1] * num_loops\n",
" fitnesses -= chromosomes_list[:, 2] * num_loops\n",
" fitnesses += chromosomes_list[:, 3] * num_loops\n",
"\n",
" # Clip negative fitness values to zero\n",
" fitnesses = np.where(fitnesses < 0, 0, fitnesses)\n",
" return fitnesses"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now, we can measure the time to perform some generations of the Genetic Algorithm with Data Parallel Python Numba dpex. \n",
"Now, we can measure the time to perform some generations of the Genetic Algorithm with the Data Parallel Extension for NumPy. \n",
"\n",
"Similarly like before, the time of the evaluation, creation of new generation and fitness wipe are measured for GPU execution. But first, we need to send all the chromosomes and fitnesses container to the chosen device. "
"As before, we measure the time of the evaluation, the creation of the new generation, and the fitness wipe for GPU execution. But first, we need to send the chromosomes and fitnesses containers to the chosen device, the GPU. "
]
},
{
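The vectorized form in `eval_genomes_dpnp` replaces the three 3000-iteration loops of the removed kernel with multiplications by `num_loops`. The following is a minimal sketch of that equivalence, using plain NumPy as a stand-in for dpnp (an assumption made so the example runs without a GPU). The names `eval_loop` and `eval_vectorized` are hypothetical and do not appear in the notebook.

```python
import numpy as np

NUM_LOOPS = 3000  # same constant as num_loops in the notebook


def eval_loop(chromosomes, fitnesses):
    """Per-chromosome loop version, mirroring the removed kernel body."""
    out = np.array(fitnesses, dtype=np.float64)
    for pos in range(chromosomes.shape[0]):
        for _ in range(NUM_LOOPS):
            out[pos] += chromosomes[pos, 1]
        for _ in range(NUM_LOOPS):
            out[pos] -= chromosomes[pos, 2]
        for _ in range(NUM_LOOPS):
            out[pos] += chromosomes[pos, 3]
        # Clip negative fitness to zero
        if out[pos] < 0:
            out[pos] = 0
    return out


def eval_vectorized(chromosomes, fitnesses):
    """Whole-array version: adding a value n times equals adding value * n."""
    out = np.array(fitnesses, dtype=np.float64)
    out += chromosomes[:, 1] * NUM_LOOPS
    out -= chromosomes[:, 2] * NUM_LOOPS
    out += chromosomes[:, 3] * NUM_LOOPS
    # Clip negative fitness to zero
    return np.where(out < 0, 0, out)


rng = np.random.default_rng(0)
population = rng.random((8, 5))   # 8 chromosomes with 5 genes each
start = np.zeros(8)

# The two versions agree up to floating-point rounding.
print(np.allclose(eval_loop(population, start),
                  eval_vectorized(population, start), atol=1e-6))
```

The results match only up to rounding, because repeated addition and a single multiplication accumulate floating-point error differently.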
@@ -399,25 +396,26 @@
},
"outputs": [],
"source": [
"import dpnp\n",
"\n",
"print(\"SYCL:\")\n",
"start = time.time()\n",
"\n",
"# Genetic Algorithm on GPU\n",
"for i in range(num_generations):\n",
" print(\"Gen \" + str(i+1) + \"/\" + str(num_generations))\n",
" chromosomes_flat = chromosomes.flatten()\n",
" chromosomes_flat_dpctl = dpnp.asarray(chromosomes_flat, device=\"gpu\")\n",
" fitnesses_dpctl = dpnp.asarray(fitnesses, device=\"gpu\")\n",
" chromosomes_dpctl = chromosomes\n",
" fitnesses_dpctl = fitnesses\n",
" try:\n",
" chromosomes_dpctl = dpnp.asarray(chromosomes, device=\"gpu\")\n",
" fitnesses_dpctl = dpnp.asarray(fitnesses, device=\"gpu\")\n",
" except Exception:\n",
" print(\"GPU device is not available\")\n",
" \n",
" fitnesses = eval_genomes_dpnp(chromosomes, fitnesses)\n",
" \n",
" exec_range = kernel_api.Range(pop_size)\n",
" numba_dpex.call_kernel(eval_genomes_sycl_kernel, exec_range, chromosomes_flat_dpctl, fitnesses_dpctl, chrom_size)\n",
" fitnesses = dpnp.asnumpy(fitnesses_dpctl)\n",
" chromosomes = next_generation(chromosomes, fitnesses)\n",
" fitnesses = np.zeros(pop_size, dtype=np.float32)\n",
"\n",
"\n",
"end = time.time()\n",
"time_sycl = end-start\n",
"print(\"time elapsed: \" + str((time_sycl)))\n",
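The try/except pattern in this hunk falls back to the host arrays when no GPU is available. A minimal sketch of the same idea as a reusable helper follows. The helper name `to_device` and its returned flag are hypothetical; the only dpnp call used is `dpnp.asarray(..., device=...)`, which the hunk itself uses. The sketch also treats a missing dpnp installation as "device not available", which is an assumption beyond what the notebook does.

```python
import numpy as np


def to_device(array, device="gpu"):
    """Try to copy `array` to `device` with dpnp; fall back to NumPy.

    Returns (array, on_device), where on_device tells the caller
    whether the returned array actually lives on the requested device.
    """
    try:
        import dpnp  # may be missing on machines without the AI Tools stack
        return dpnp.asarray(array, device=device), True
    except Exception:
        # Covers both a missing dpnp package and an unavailable device.
        return np.asarray(array), False


fitnesses = np.zeros(4, dtype=np.float32)
fitnesses_dev, on_device = to_device(fitnesses)
print("running on device:", on_device)
```

Note that in the hunk above the evaluation call receives `chromosomes` and `fitnesses` rather than the `_dpctl` copies, so a caller would need to pass the converted arrays for the computation to run on the device.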
@@ -457,7 +455,7 @@
"\n",
"plt.figure()\n",
"plt.title(\"Time comparison\")\n",
"plt.bar([\"Numba_dpex\", \"without optimization\"], [time_sycl, time_cpu])\n",
"plt.bar([\"DPNP\", \"without optimization\"], [time_sycl, time_cpu])\n",
"\n",
"plt.show()"
]
@@ -546,7 +544,7 @@
"\n",
"To evaluate the created generation, we calculate the full distance of the given path (chromosome). In this example, the lower the fitness value, the better the chromosome. That's different from the general GA that we implemented.\n",
"\n",
"As in this example we are also using numba-dpex, we are using an index like before."
"As in the previous example, dpnp will vectorize basic mathematical operations to benefit from optimizations."
]
},
{
@@ -555,11 +553,11 @@
"metadata": {},
"outputs": [],
"source": [
"@numba_dpex.kernel\n",
"def eval_genomes_plain_TSP_SYCL(item: kernel_api.Item, chromosomes, fitnesses, distances, pop_length):\n",
" pos = item.get_id(1)\n",
" for j in range(pop_length-1):\n",
" fitnesses[pos] += distances[int(chromosomes[pos, j]), int(chromosomes[pos, j+1])]\n"
"def eval_genomes_plain_TSP_SYCL(chromosomes, fitnesses, distances, pop_length):\n",
" for pos in range(pop_length):\n",
" for j in range(chromosomes.shape[1]-1):\n",
" fitnesses[pos] += distances[int(chromosomes[pos, j]), int(chromosomes[pos, j+1])]\n",
" return fitnesses\n"
]
},
{
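The new `eval_genomes_plain_TSP_SYCL` still walks every chromosome and every leg of the route in Python loops. As an alternative sketch, the same path length can be computed with integer-array indexing, shown here in plain NumPy. The function name `path_lengths` is hypothetical, and whether the identical indexing expression works on dpnp arrays is an assumption not verified here.

```python
import numpy as np


def path_lengths(chromosomes, distances):
    """Sum the distance of each consecutive city pair along every route.

    chromosomes: (pop_size, n_cities) array of city indices
    distances:   (n_cities, n_cities) distance matrix
    """
    routes = np.asarray(chromosomes).astype(np.intp)
    # routes[:, :-1] are the departure cities, routes[:, 1:] the arrivals;
    # indexing with both picks one distance per leg of each route.
    legs = distances[routes[:, :-1], routes[:, 1:]]
    return legs.sum(axis=1)


distances = np.array([[0.0, 1.0, 2.0],
                      [1.0, 0.0, 3.0],
                      [2.0, 3.0, 0.0]])
routes = np.array([[0, 1, 2],   # 0 -> 1 -> 2: 1 + 3
                   [2, 0, 1]])  # 2 -> 0 -> 1: 2 + 1
print(path_lengths(routes, distances))  # [4. 3.]
```

Like the notebook's loop, this sums only the legs between consecutive cities and does not add a return leg to the starting city.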
@@ -703,22 +701,26 @@
"source": [
"print(\"Traveling Salesman Problem:\")\n",
"\n",
"distances_dpctl = dpnp.asarray(distances, device=\"gpu\")\n",
"distances_dpnp = distances\n",
"try:\n",
" distances_dpnp = dpnp.asarray(distances, device=\"gpu\")\n",
"except Exception:\n",
" print(\"GPU device is not available\")\n",
"\n",
"# Genetic Algorithm on GPU\n",
"for i in range(num_generations):\n",
" print(\"Gen \" + str(i+1) + \"/\" + str(num_generations))\n",
" chromosomes_flat_dpctl = dpnp.asarray(chromosomes, device=\"gpu\")\n",
" fitnesses_dpctl = dpnp.asarray(fitnesses.copy(), device=\"gpu\")\n",
"\n",
" exec_range = kernel_api.Range(pop_size)\n",
" numba_dpex.call_kernel(eval_genomes_plain_TSP_SYCL, exec_range, chromosomes_flat_dpctl, fitnesses_dpctl, distances_dpctl, pop_size)\n",
" fitnesses = dpnp.asnumpy(fitnesses_dpctl)\n",
" chromosomes = next_generation_TSP(chromosomes, fitnesses)\n",
" chromosomes_dpnp = chromosomes\n",
" try:\n",
" chromosomes_dpnp = dpnp.asarray(chromosomes, device=\"gpu\")\n",
" except Exception:\n",
" print(\"GPU device is not available\")\n",
"\n",
" fitnesses = np.zeros(pop_size, dtype=np.float32)\n",
"\n",
"for i in range(len(chromosomes)):\n",
" for j in range(11):\n",
" fitnesses[i] += distances[int(chromosomes[i][j])][int(chromosomes[i][j+1])]\n",
" fitnesses = eval_genomes_plain_TSP_SYCL(chromosomes_dpnp, fitnesses, distances_dpnp, pop_size)\n",
" chromosomes = next_generation_TSP(chromosomes, fitnesses)\n",
"\n",
"fitness_pairs = []\n",
"\n",
@@ -736,7 +738,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"In this code sample, there was a general purpose Genetic Algorithm created and optimized using numba-dpex to run on GPU. Then the same approach was applied to the Traveling Salesman Problem."
"In this code sample, a general-purpose Genetic Algorithm was created and optimized using dpnp to run on GPU. Then the same approach was applied to the Traveling Salesman Problem."
]
},
{
@@ -756,7 +758,7 @@
"provenance": []
},
"kernelspec": {
"display_name": "base",
"display_name": "Base",
"language": "python",
"name": "base"
},
@@ -770,7 +772,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.18"
"version": "3.9.19"
}
},
"nbformat": 4,