Commons talk:Structured data

From Wikimedia Commons, the free media repository
SpBot archives all sections tagged with {{Section resolved|1=~~~~}} after 7 days.

So this is essentially a front-end to Wikidata?

Do I understand this correctly: every file on Commons is also a node (item?) on Wikidata, in the sense that it is assigned a QID, so that you can start making statements about it, and the structured data tab is essentially a front-end to the Wikidata database? The main difference from editing statements on Wikidata appears to be that this interface does not show the subject QID anywhere! Or are you using other URIs as resource identifiers? (Being explicit about such technical details would have helped me understand what this was all about.)

Do you have any examples of statements relating one file to another? (I didn't encounter anything like that, browsing a few of the examples in the list of properties.) Practical application: relating different BSicons to each other. 88.129.117.158 14:59, 7 July 2024 (UTC)

No, each image does not have a QID. Each media item instead has an identifier consisting of "M" followed by its page ID, but as you say the interface mostly hides this. I don't think there's any way to relate files to one another directly; I've certainly never seen one. Notably, when Wikidata wants to refer to Commons files, it does it by name and not by using the "M" IDs. --bjh21 (talk) 15:31, 7 July 2024 (UTC)
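The identifier scheme described above can be sketched in a few lines. This is only an illustration: the page ID 12345 is made up, the helper names are hypothetical, and the entity URI prefix is the one Commons uses for its media entities.

```python
# Minimal sketch: derive the "M" identifier of a file from its page ID.
# The page ID used below (12345) is a made-up example, not a real file.

COMMONS_ENTITY_BASE = "https://commons.wikimedia.org/entity/"


def mediainfo_id(page_id: int) -> str:
    """Return the media entity ID: the letter "M" followed by the page ID."""
    if page_id <= 0:
        raise ValueError("page ID must be a positive integer")
    return f"M{page_id}"


def mediainfo_uri(page_id: int) -> str:
    """Return the entity URI that identifies the file's structured data."""
    return COMMONS_ENTITY_BASE + mediainfo_id(page_id)


print(mediainfo_id(12345))
print(mediainfo_uri(12345))
```

The page ID of a real file is shown under "Page information" in the sidebar of its file page.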

Bot to remove duplicate statements

Hi folks, sometimes we end up with duplicate statements on files. This often happens when bots edit the same file twice. I set up a bot to clean that up about once a month (code). It goes through the latest dump at https://dumps.wikimedia.org/commonswiki/entities/ (the copy on Toolforge) and removes any duplicate statements. It hashes each statement for quick processing, so it doesn't (yet) catch cases like this one, where the qualifiers are in a different order. Multichill (talk) 20:29, 7 July 2024 (UTC)
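One possible way to catch the reordered-qualifier case is to hash a canonical form of each statement instead of its raw serialization. The following is a sketch of that idea, not the bot's actual code: the function names are hypothetical, and it assumes statements in the usual Wikibase JSON layout (`mainsnak`, `qualifiers`). References and rank are deliberately ignored here.

```python
import hashlib
import json


def _snak_core(snak: dict) -> dict:
    """Keep only the parts of a snak that define its meaning (drops 'hash')."""
    return {
        "snaktype": snak.get("snaktype"),
        "property": snak.get("property"),
        "datavalue": snak.get("datavalue"),
    }


def statement_key(statement: dict) -> str:
    """Hash a statement so that qualifier order does not affect the result."""
    qualifiers = statement.get("qualifiers", {})
    canonical = {
        "mainsnak": _snak_core(statement["mainsnak"]),
        # Serialize each qualifier snak, then sort, so list order is irrelevant.
        "qualifiers": {
            prop: sorted(json.dumps(_snak_core(s), sort_keys=True) for s in snaks)
            for prop, snaks in qualifiers.items()
        },
    }
    # sort_keys=True also makes the order of qualifier properties irrelevant.
    blob = json.dumps(canonical, sort_keys=True)
    return hashlib.sha256(blob.encode("utf-8")).hexdigest()


def find_duplicates(statements: list) -> list:
    """Return the IDs of statements that repeat an earlier statement."""
    seen = set()
    duplicates = []
    for statement in statements:
        key = statement_key(statement)
        if key in seen:
            duplicates.append(statement.get("id"))
        else:
            seen.add(key)
    return duplicates
```

Whether two statements that differ only in references or rank should count as duplicates is a policy question this sketch does not settle.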

List of bots involved

I think it'd be good for information, reference and documentation purposes to have a page that lists all bots and how they are involved in editing SDC. Then we will know whether a specific task is already taken care of or, if a bot malfunctions, which tasks are not being carried out, etc. RZuo (talk) 10:41, 6 August 2024 (UTC)

For example, I want to know whether some bot is already in charge of migrating accidental additions of coordinate location (P625) to location of creation (P1071). RZuo (talk) 10:43, 6 August 2024 (UTC)
See Commons:Structured_data/Modeling#Bots --XRay 💬 10:55, 6 August 2024 (UTC)
Thx a lot.
Also, I made a mistake above: it should be moving P625 to coordinates of the point of view (P1259), not to P1071. RZuo (talk) 11:57, 6 August 2024 (UTC)

Cannot find haswbstatement:P1259

https://commons.wikimedia.org/w/index.php?search=haswbstatement:P1259 returns nothing, even though File:Martin-Luther-Kirche Behringen 2024-04-09 03.jpg, for example, has that statement. Why? This isn't expected, right? RZuo (talk) 11:58, 6 August 2024 (UTC)

@RZuo: Ah yes, I remember looking at this before. I think CirrusSearch can only see properties whose values are text or Wikidata items. If you use the API to get cirrusbuilddoc for that page (API sandbox link) you'll find that statement_keywords contains source of file (P7482), copyright license (P275), copyright status (P6216), captured with (P4082), media type (P1163), and checksum (P4092), but not creator (P170), inception (P571), data size (P3575), or coordinates of the point of view (P1259). bjh21 (talk) 17:37, 6 August 2024 (UTC)
Thx a lot for the tips! RZuo (talk) 19:36, 6 August 2024 (UTC)
You can also have a look at the cirrusdump: [1] --XRay 💬 20:04, 6 August 2024 (UTC)
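The check described above can be scripted. The sketch below builds the API request for `prop=cirrusbuilddoc` and lists which properties appear in `statement_keywords`. The helper names are hypothetical, and two things are assumptions rather than documented guarantees: that the response nests the document under `query.pages[].cirrusbuilddoc` (with `formatversion=2`), and that each keyword starts with the property ID followed by `=`.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def cirrusbuilddoc_url(title: str) -> str:
    """Build the API URL that asks for the search document of one page."""
    params = {
        "action": "query",
        "prop": "cirrusbuilddoc",
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return API + "?" + urlencode(params)


def indexed_properties(api_response: dict) -> set:
    """Collect the property IDs that occur in statement_keywords.

    Assumes keywords look like "P275=Q18199165"; anything before the first
    "=" is treated as the property ID.
    """
    properties = set()
    for page in api_response.get("query", {}).get("pages", []):
        document = page.get("cirrusbuilddoc", {})
        for keyword in document.get("statement_keywords", []):
            properties.add(keyword.split("=", 1)[0])
    return properties
```

A property that has statements on the file but is missing from the returned set will not be found by `haswbstatement:`.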

Inception for files where only Upload date is available

Is there a preferred way to add SDC inception (P571) claims for files with an unknown creation date but a known {{Upload date}}? As far as I can see, there's no "upload date" qualifier on Wikidata. Is it OK to simply use the upload date for inception (P571)? Fl.schmitt (talk) 20:30, 22 August 2024 (UTC)

@Fl.schmitt: how about "unknown value" with the qualifier latest date (P1326) set to the upload date? - Jmabel ! talk 04:30, 23 August 2024 (UTC)
@Jmabel - that's a good solution! I've used it already at Karte Bodensee Birnau.png - and {{Information}} is able to use latest date (P1326) as the default if there's no date parameter - great! I wasn't able to set "unknown date" manually, but pywikibot was able to create such a claim with snaktype "somevalue". Fl.schmitt (talk) 07:16, 23 August 2024 (UTC)
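For illustration, the statement described above can be written out in Wikibase JSON, the format a bot would send to the API. This is a sketch under assumptions: the function name is hypothetical, day precision is assumed for the upload date, and it only builds the data structure without saving anything.

```python
import datetime

# Proleptic Gregorian calendar, the calendar model Wikibase uses by default.
GREGORIAN = "http://www.wikidata.org/entity/Q1985727"


def unknown_inception_statement(upload_date: datetime.date) -> dict:
    """Build an inception (P571) statement with an unknown value and a
    latest date (P1326) qualifier set to the upload date."""
    time_value = {
        "time": f"+{upload_date.year:04d}-{upload_date.month:02d}-{upload_date.day:02d}T00:00:00Z",
        "timezone": 0,
        "before": 0,
        "after": 0,
        "precision": 11,  # 11 means day precision
        "calendarmodel": GREGORIAN,
    }
    return {
        "type": "statement",
        "rank": "normal",
        # "somevalue" is the snak type shown as "unknown value" in the UI.
        "mainsnak": {"snaktype": "somevalue", "property": "P571"},
        "qualifiers": {
            "P1326": [
                {
                    "snaktype": "value",
                    "property": "P1326",
                    "datavalue": {"type": "time", "value": time_value},
                }
            ]
        },
    }


statement = unknown_inception_statement(datetime.date(2024, 8, 22))
print(statement["qualifiers"]["P1326"][0]["datavalue"]["value"]["time"])
```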

VIRIN property proposal

See d:Wikidata:Property proposal/VIRIN. Multichill (talk) 13:27, 24 August 2024 (UTC)

It seems that there's no consensus yet on using OpenStreetMap way ID (P10689) (or OpenStreetMap relation ID (P402) and OpenStreetMap node ID (P11693), respectively) in SDC. Using those properties would allow SPARQL queries based on OSM IDs, like https://w.wiki/B3sq or https://w.wiki/B3su - with the advantage that such a query could return multiple views of an OSM map feature (similar to Google Maps images for a map object). What's your opinion on this proposal? Fl.schmitt (talk) 06:19, 30 August 2024 (UTC)

Better example: https://w.wiki/B3t6 ("Big Ben", London) Fl.schmitt (talk) 06:33, 30 August 2024 (UTC)
Query for current usage of P10689, grouped by OSM ID: https://w.wiki/B3tC Fl.schmitt (talk) 06:37, 30 August 2024 (UTC)
If I remember correctly, the opposition to using OSM identifiers in Wikidata was that the OSM identifiers weren't stable. The proposed method then was to add Wikidata items to OSM and link from OSM to Wikidata. This would also create a permanent identifier to OSM for an entity. However, the problem with this approach was that it is impossible to know in the wiki Lua/template code whether there is anything on the OSM side, which is a problem when the template creates links to OSM or uses the OSM location in maps. Solving this for in-wiki use would require adding it to the software (Lua/Wikidata), and afaik the only workaround is to add these as properties. In SPARQL, however, images can be queried using federated queries (https://w.wiki/B3tZ or https://w.wiki/B3tt), but with a performance penalty. --Zache (talk) 07:05, 30 August 2024 (UTC)
Photos don’t have OSM way/relation/node IDs, only the depicted places have them. Therefore I don’t think these properties should be used directly in SDC: add them to the appropriate Wikidata item and link to that using depicts (P180). Those can also be queried, using federation:
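#defaultView:ImageGrid
# Find Commons files that depict the Wikidata item carrying a given OSM way ID.
select ?place ?placeLabel ?thumb with {
  select * {
    # Federated subquery: resolve the OSM way ID to a Wikidata item and its English label.
    service <https://query.wikidata.org/sparql> {
      bind('54486345' as ?way_id).
      ?place wdt:P10689 ?way_id;
             rdfs:label ?placeLabel.
      filter (lang(?placeLabel) = "en").
    }
  }
} as %places where {
  include %places.
  # On the Commons side: files whose depicts (P180) statement points at that item.
  ?image wdt:P180 ?place;
         schema:url ?thumb.
}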
(By the way, as you can see from these results, your example of finding the Big Ben / Elizabeth Tower by the ID https://www.openstreetmap.org/way/54486345 is wrong: it’s the nearby St Margaret’s Church. This doesn’t invalidate your example, but it does mean that the files your query finds have incorrect SDC and should be fixed.) --Tacsipacsi (talk) 13:35, 1 September 2024 (UTC)
@Tacsipacsi: Thanks a lot - that's in fact interesting, I didn't check the OSM ID for the Big Ben example. Good hint, I will look at it. Regarding "Photos don’t have OSM way/relation/node IDs, only the depicted places have them": Here I disagree - that's exactly the point of my question. Of course, there's no such thing as a 1:1 relation between a photo and its object (on that I agree). But there are 429,892 OSM tags which have Commons files or categories "attached". So, given that OSM entities have Commons content (and categories/files can reference OSM objects using {{On OSM}} or {{OSMLink}}), there seems to be a practical need for such relations. Usage Bot has collected more than 200,000 Commons files used on OSM. I admit that OSM identifiers may not be stable, but is this a practical problem? This looks like a standard maintenance task for a bot: periodically check such ID references for validity.
Regarding depicts (P180): Using that property in SDC works only if there's a Wikidata item as target. But on OSM, the wikimedia_commons or image attributes are used on objects like hiking sign posts, wayside shrines or other "non-notable" things. Especially for hiking sign posts (but also for fountains, sculptures and other 3D objects), it would be very useful to have multiple images available, showing different perspectives of that object. On OSM, there's currently no "recommended" way to reference multiple images, leading to incoherent use of the respective tags. Assigning an OSM node ID to a Commons file would allow a 1:n relation between OSM objects and Commons files. Fl.schmitt (talk) 17:34, 1 September 2024 (UTC)
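The periodic validity check mentioned above could look roughly like this. It is only a sketch with hypothetical helper names: it builds the OSM API 0.6 URL for an element and interprets the HTTP status a bot would get back (200 for an existing element, 410 for a deleted one, 404 for an ID that never existed). The actual HTTP request, rate limiting, and the follow-up edit on Commons are left out.

```python
OSM_API = "https://api.openstreetmap.org/api/0.6"
VALID_KINDS = {"node", "way", "relation"}


def osm_element_url(kind: str, osm_id: int) -> str:
    """Build the OSM API URL for a node, way or relation."""
    if kind not in VALID_KINDS:
        raise ValueError(f"unknown OSM element type: {kind!r}")
    if osm_id <= 0:
        raise ValueError("OSM IDs are positive integers")
    return f"{OSM_API}/{kind}/{osm_id}"


def classify_status(http_status: int) -> str:
    """Map the HTTP status of an element request to a maintenance verdict."""
    if http_status == 200:
        return "exists"
    if http_status == 410:
        return "deleted"  # the element existed once but has been removed
    if http_status == 404:
        return "not found"  # no element with this ID was ever created
    return "unknown"  # e.g. server error or rate limit; retry later


print(osm_element_url("way", 54486345))
```

A status check like this cannot tell whether an element that still exists has been re-tagged or redrawn to represent something else, so it addresses only part of the stability concern.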

Test environment for SDC?

Is there an installation for testing SDC? I found https://test.wikipedia.org but that doesn't have SDC. https://test-commons.wikimedia.org seems to be closed. ∞∞ Enhancing999 (talk) 19:06, 2 September 2024 (UTC)

Much testing should be possible on https://commons.wikimedia.beta.wmflabs.org, but you might need to request rights on Beta Wikidata to create properties. GPSLeo (talk) 19:31, 2 September 2024 (UTC)

YouTube SDC

Please see Commons:Bots/Work requests#Add P1651 YouTube video ID structured data from "source" attribute of Filedesc template for the sample schema, which is more or less identical to Commons:Structured data/Modeling#Upload from a platform like Panoramio, Geograph or Flickr -- DaxServer (talk) 10:32, 13 September 2024 (UTC)[reply]
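As a rough idea of the first step such a bot would need, the sketch below pulls a video ID out of a YouTube URL taken from a source field. It is not the work request's implementation: the function name is hypothetical, only a few common URL shapes are handled, and the 11-character ID pattern is an observed convention rather than a documented guarantee.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters from this alphabet.
_ID_PATTERN = re.compile(r"^[A-Za-z0-9_-]{11}$")
_YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID found in a YouTube URL, or None if there is none."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    candidate = None
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in _YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        elif parsed.path.startswith(("/embed/", "/shorts/")):
            candidate = parsed.path.split("/")[2]
    if candidate and _ID_PATTERN.match(candidate):
        return candidate
    return None
```

Returning None for anything unrecognized lets the bot skip a file rather than write a doubtful P1651 value.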