
Recommender systems (RS) help users mitigate information overload by suggesting relevant items in a personalized way. In recent years, due to advances in deep learning, visually-aware recommenders have attracted increased research interest. Such systems combine information contained in images, represented as high-level feature vectors, with collaborative signals in a hybrid recommendation approach. Since item catalogs can be huge, recommendation service providers often rely on pre-trained image models such as ResNet and on images that are supplied by the item providers, e.g., through an API. As a result, the providers of the images can influence what is presented to users through the recommendations. In this work, we show that relying on such external sources can make an RS vulnerable to attacks in which the goal of the attacker is to unfairly promote certain pushed items. Specifically, we demonstrate how a new visual attack model can effectively influence item scores and rankings in a black-box setting, i.e., without knowing the parameters of the model. The main underlying idea is to systematically create small human-imperceptible perturbations of the pushed item image and to devise appropriate gradient approximation methods to incrementally raise the score of the pushed item. Experimental evaluations on two datasets show that the novel attack model is effective even when the contribution of the visual features to the overall performance of the recommender system is modest.
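
To make the general idea concrete, the following is a minimal, hypothetical sketch of a black-box item-push attack based on gradient approximation, not the paper's actual attack model. It assumes the attacker can only query a scoring function `score_fn` (standing in for the recommender's visual pathway) for the pushed item's image, estimates a gradient via antithetic Gaussian sampling, and keeps the perturbation within a small L-infinity ball so the change stays human-imperceptible. All names and parameter values are illustrative assumptions.

```python
import numpy as np

def estimate_gradient(score_fn, image, sigma=0.01, n_samples=50):
    """Approximate the gradient of the black-box score w.r.t. the image
    using antithetic Gaussian sampling (finite-difference / NES-style)."""
    grad = np.zeros_like(image)
    for _ in range(n_samples):
        noise = np.random.normal(size=image.shape)
        s_plus = score_fn(np.clip(image + sigma * noise, 0.0, 1.0))
        s_minus = score_fn(np.clip(image - sigma * noise, 0.0, 1.0))
        grad += (s_plus - s_minus) * noise
    return grad / (2.0 * sigma * n_samples)

def black_box_push_attack(score_fn, image, epsilon=0.03, step=0.005, n_iters=100):
    """Iteratively add small perturbations that raise the pushed item's score,
    constrained to an L-infinity ball of radius epsilon around the original."""
    original = image.copy()
    adv = image.copy()
    for _ in range(n_iters):
        grad = estimate_gradient(score_fn, adv)
        adv = adv + step * np.sign(grad)                              # ascend estimated gradient
        adv = np.clip(adv, original - epsilon, original + epsilon)    # imperceptibility constraint
        adv = np.clip(adv, 0.0, 1.0)                                  # keep valid pixel range
    return adv

# Toy usage: a fixed linear "scorer" stands in for the recommender's
# image-based scoring, queried purely as a black box.
rng = np.random.default_rng(0)
w = rng.normal(size=(64, 64, 3))
score_fn = lambda img: float((w * img).sum())
pushed_image = rng.uniform(size=(64, 64, 3))
adv_image = black_box_push_attack(score_fn, pushed_image)
print(score_fn(pushed_image), score_fn(adv_image))  # the pushed item's score should increase
```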