Speeding up a parallel bitonic sort implementation


Posted by Anonymous » 31 Oct 2024, 02:42

I am trying to implement a multithreaded version of bitonic sort on the CPU in C++. So far, the best I can get with this implementation is a speedup of about 4.3x (when sorting a 128 MB array), even when I use 16 threads. I used the following code:

Code:

#include <cstdint>
#include <utility>
#include <vector>

void compareAndSwap(std::vector<uint32_t>& paddedValues, unsigned int threadId,
                    unsigned int chunkSize, unsigned int mergeStep, unsigned int bitonicSequenceSize)
{
    unsigned int startIndex = threadId * chunkSize;
    unsigned int endIndex = (threadId + 1) * chunkSize;

    // Process the chunk assigned to this thread
    for (unsigned int currentIndex = startIndex; currentIndex < endIndex; currentIndex++)
    {
        // Find the element to compare with
        unsigned int compareIndex = currentIndex ^ mergeStep;

        // Only compare if the compareIndex is greater (to avoid duplicate swaps)
        if (compareIndex > currentIndex)
        {
            bool shouldSwap = false;

            // Determine if we should swap based on the current subarray's sorting direction
            if ((currentIndex & bitonicSequenceSize) == 0)  // First half of subarray (ascending)
            {
                shouldSwap = (paddedValues[currentIndex] > paddedValues[compareIndex]);
            }
            else  // Second half of subarray (descending)
            {
                shouldSwap = (paddedValues[currentIndex] < paddedValues[compareIndex]);
            }

            // Perform the swap if necessary
            if (shouldSwap)
            {
                std::swap(paddedValues[currentIndex], paddedValues[compareIndex]);
            }
        }
    }
}

void bitonicSort(uint32_t values[], unsigned int arrayLength, unsigned int numThreads, int sortOrder)
{
    // Step 1: Pad the array to the next power of 2
    unsigned int paddedLength = 1 
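
The listing breaks off inside bitonicSort, so the driver loop is not visible here. For readers who want something runnable, below is a minimal sketch of how a chunked compare-exchange pass like compareAndSwap is commonly driven. It is not the original code: the names compareExchangeRange and bitonicSortSketch are hypothetical, it sorts in ascending order only, it assumes padding with the maximum uint32_t value, and it assumes the thread count is rounded down to a power of two so that the chunks divide the padded array evenly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical helper: one compare-exchange pass over the indices [begin, end).
void compareExchangeRange(std::vector<uint32_t>& data, std::size_t begin, std::size_t end,
                          std::size_t mergeStep, std::size_t bitonicSequenceSize)
{
    for (std::size_t i = begin; i < end; ++i) {
        const std::size_t partner = i ^ mergeStep;
        if (partner <= i) continue;  // each pair is handled once, by its lower index

        // Blocks whose index has the bitonicSequenceSize bit clear sort ascending.
        const bool ascending = (i & bitonicSequenceSize) == 0;
        const bool outOfOrder = ascending ? data[i] > data[partner]
                                          : data[i] < data[partner];
        if (outOfOrder) std::swap(data[i], data[partner]);
    }
}

// Hypothetical driver (ascending only); an assumption, not the truncated bitonicSort.
void bitonicSortSketch(std::vector<uint32_t>& values, unsigned int numThreads)
{
    const std::size_t originalSize = values.size();
    if (originalSize < 2) return;

    // Pad to the next power of two with the maximum value so padding sorts to the end.
    std::size_t paddedSize = 1;
    while (paddedSize < originalSize) paddedSize <<= 1;
    values.resize(paddedSize, std::numeric_limits<uint32_t>::max());

    // Assumption: round the thread count down to a power of two (at least 1).
    std::size_t threads = 1;
    while (threads * 2 <= numThreads && threads * 2 <= paddedSize) threads *= 2;
    const std::size_t chunkSize = paddedSize / threads;
    assert(chunkSize * threads == paddedSize);

    for (std::size_t seq = 2; seq <= paddedSize; seq <<= 1) {
        for (std::size_t step = seq >> 1; step > 0; step >>= 1) {
            std::vector<std::thread> workers;
            for (std::size_t t = 1; t < threads; ++t) {
                const std::size_t begin = t * chunkSize;
                const std::size_t end = begin + chunkSize;
                workers.emplace_back([&values, begin, end, step, seq] {
                    compareExchangeRange(values, begin, end, step, seq);
                });
            }
            compareExchangeRange(values, 0, chunkSize, step, seq);  // calling thread takes chunk 0
            for (std::thread& worker : workers) worker.join();      // barrier before the next pass
        }
    }

    values.resize(originalSize);  // drop the padding, which now sits at the end
}
```

For example, calling bitonicSortSketch(v, 4) on a vector holding 5, 3, 9, 1, 7, 3, 0 leaves it as 0, 1, 3, 3, 5, 7, 9. Note that this sketch creates and joins threads on every pass, and the number of passes grows as log2(n) * (log2(n) + 1) / 2, with each pass streaming over the whole array. Both the per-pass thread overhead and memory bandwidth are plausible limits on speedup, though whether either applies to the original code cannot be told from the truncated listing.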

More details here: [url]https://stackoverflow.com/questions/79143120/parallel-bitonic-sort-implementation-speedup[/url]
