— Technical
Using insertFromSelect for bulk INSERT ... SELECT queries in Magento 2
30 October 2023 · 5 min read
Direct database writes come up more often than Magento purists like to admit. Data migrations, attribute backfills, performance-sensitive bulk operations — sometimes you need to write SQL directly, and when you do, you should write it the Magento way rather than constructing raw query strings.
insertFromSelect is the method for writing INSERT ... SELECT queries through Magento’s database adapter. It’s cleaner than string concatenation, handles quoting correctly, and makes the duplicate handling strategy explicit.
The pattern
Say you need to copy image attribute values to the small_image attribute for all products where small_image isn’t already set. The raw SQL looks like:
INSERT IGNORE INTO `catalog_product_entity_varchar`
(`attribute_id`, `store_id`, `value`, `row_id`)
SELECT 88, `store_id`, `value`, `row_id`
FROM `catalog_product_entity_varchar`
WHERE value != 'no_selection'
AND attribute_id = 87;
In Magento code:
<?php
declare(strict_types=1);
namespace YourVendor\YourModule\Model;
use Magento\Framework\App\ResourceConnection;
use Magento\Framework\DB\Adapter\AdapterInterface;
class AttributeBackfill
{
private const TABLE = 'catalog_product_entity_varchar';
public function __construct(
private readonly ResourceConnection $resourceConnection
) {}
public function execute(): void
{
$connection = $this->resourceConnection->getConnection();
$table = $connection->getTableName(self::TABLE);
$imageAttributeId = 87; // image
$smallImageAttributeId = 88; // small_image
$select = $connection->select()
->from(
$table,
[
new \Zend_Db_Expr($smallImageAttributeId),
'store_id',
'value',
'row_id',
]
)
->where("value != 'no_selection'")
->where('attribute_id = ?', $imageAttributeId);
$query = $connection->insertFromSelect(
$select,
$table,
['attribute_id', 'store_id', 'value', 'row_id'],
AdapterInterface::INSERT_IGNORE
);
$connection->query($query);
}
}
The duplicate handling constants
The fourth argument to insertFromSelect determines what happens when the INSERT hits a unique key constraint. The three options:
AdapterInterface::INSERT_IGNORE // = 2 — skip duplicates silently
AdapterInterface::INSERT_ON_DUPLICATE // = 1 — UPDATE the existing row (INSERT ... ON DUPLICATE KEY UPDATE)
AdapterInterface::REPLACE // = 4 — DELETE then INSERT
INSERT_IGNORE is the right choice when you’re backfilling and don’t want to touch rows that already exist. INSERT_ON_DUPLICATE is the right choice when you want to update existing rows with the new values. REPLACE is almost never the right choice — it deletes then re-inserts, which changes the row’s ID and can break foreign key relations.
The Zend_Db_Expr for literal values
When you’re SELECTing a literal value (like a hardcoded attribute ID) rather than a column name, wrap it in new \Zend_Db_Expr(...). Without it, Magento will try to quote it as a column name. This is a common source of malformed queries.
// Wrong — produces `87` quoted as an identifier
$select->from($table, [87, 'store_id', 'value']);
// Correct — produces the literal integer 87
$select->from($table, [new \Zend_Db_Expr(87), 'store_id', 'value']);
Always use getTableName()
Never hardcode the table name as a string. getTableName() applies the configured table prefix, which is often blank in development but frequently set in production deployments. Code that skips this will silently fail on any installation with a table prefix.
On Adobe Commerce vs Open Source
Note that row_id is an Adobe Commerce column — on Open Source, the equivalent column is entity_id. If you’re writing code that needs to run on both, use MetadataPool to resolve the correct link field rather than hardcoding row_id. The rest of this pattern applies equally to both editions.
For one-off data migrations this level of correctness is overkill, but for module code that ships to multiple environments it’s worth the extra five lines.
Savan Padaliya
Senior Engineering Consultant