Pluribus algorithm
By default, the content of an OpenStack Swift object cannot be greater than 5 GB. However, you can use a number of smaller objects to construct a large object via the concept of segmentation. According to the OpenStack Swift documentation, "Segments of the larger object are uploaded and a special manifest file is created that, when downloaded, sends all the segments concatenated as a single object." This "user manifest" design exists to give the downloading client transparent access to large objects while still giving the uploading client a clean API for segmented uploads.
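While working with a large data set, we ran into challenges around the exact mechanics of representing a 14 GB file as a single entity in IBM® Object Storage for Bluemix®. This blog post shares what we learned about creating OpenStack Swift manifest objects.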
Background: A 3rd party uploaded 61 separate files (segment files) into our IBM Bluemix Object Storage container, but failed to upload a corresponding manifest file. Instead, they shared a manifest file that outlined the details for each HTTP PUT request with no further context on what it was or how to use it. The contents of the file were similar to …
{'path': '/somecontainer/someprefix-NjT2OURYBq', 'etag': 'ebc7d0d4718d8513fd5cdcf76de66f2a', 'size_bytes': 234003629},
{'path': '/somecontainer/someprefix-zVliDpHox4', 'etag': '2814e177b9371770caf13902d6587373', 'size_bytes': 234521937},
{'path': '/somecontainer/someprefix-5lHhJcyjEX', 'etag': '843fbdfb493b484b035436e0bb782560', 'size_bytes': 241395892},
{'path': '/somecontainer/someprefix-Q7xSsBprGK', 'etag': '05d09e28c8994cf5f9833c9dee6494a7', 'size_bytes': 237095501},
{'path': '/somecontainer/someprefix-8pQIF4w1GR', 'etag': 'e0d912fc4b88961c33ecfe70e64a7855', 'size_bytes': 226289048},
...
Our Challenge: Referencing 61 individual files within our Jupyter Notebook seemed wrong. We wanted to pull in the entirety of the data by referencing a single OpenStack Swift URL (e.g. swift://foo/man/… ) without having to upload the entire series of files again. We suspected that the provided manifest file would prove useful, but had difficulty finding easy steps for using it with OpenStack Swift and the IBM Bluemix Object Storage service. We were largely ignorant of how OpenStack Large Object support worked and how to use OpenStack Swift manifest objects. Sooo … here is our journey, in the spirit of sharing.
Options: IBM Object Storage for Bluemix provides you with access to a fully provisioned OpenStack Object Storage (Swift) account to manage your data. IBM Object Storage for Bluemix uses OpenStack Identity (Keystone) for authentication and can be accessed directly by using Swift Object Storage API v1 calls. OpenStack Large Object Support is enabled and available for the IBM Object Storage for Bluemix service. But don't take my word for it … issuing an HTTP GET request to the /info endpoint confirms this via the presence of an slo section. To support as many use cases as possible, OpenStack Swift supports two (2) flavors: Static Large Objects (SLO) and Dynamic Large Objects (DLO).
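The /info check mentioned above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names fetch_info and slo_settings are hypothetical, the host name is the Dallas endpoint used elsewhere in this post, and the keys inside the slo section vary by Swift version.

```python
import json
import urllib.request


def fetch_info(storage_host: str) -> dict:
    """GET https://<storage_host>/info and decode the JSON capability document."""
    with urllib.request.urlopen(f"https://{storage_host}/info") as resp:
        return json.load(resp)


def slo_settings(info: dict):
    """Return the 'slo' section of an /info document, or None if SLO is not advertised."""
    return info.get("slo")


if __name__ == "__main__":
    # Assumed host; substitute the storage host for your own region.
    info = fetch_info("dal.objectstorage.open.softlayer.com")
    print("SLO enabled:", slo_settings(info) is not None)
```

If slo_settings returns None, the cluster does not advertise Static Large Object support and the static manifest approach is unlikely to work there.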
Reader Tip: Consider jumping to the Easy Button section if time is short and you're looking to solve the happy path (e.g. you need to upload a local >5 GB file into IBM Bluemix Object Storage based on OpenStack Swift).
Game Plan:
Mechanics to Solve Our Challenge:
$ cf service-keys {name_of_your_object_storage_service}
Getting keys for service instance {name_of_your_object_storage_service} as {your_username}...

name
Credentials-1

$ cf service-key {name_of_your_object_storage_service} Credentials-1
Getting key Credentials-1 for service instance {name_of_your_object_storage_service} as {your_username}...

{
  "auth_url": "https://identity.open.softlayer.com",
  "domainId": "nice_long_hex_value",
  "domainName": "some_number",
  "password": "not_gonna_tell_you",
  "project": "object_storage_hex_value",
  "projectId": "project_hex_value",
  "region": "dallas",
  "userId": "another_fine_hex_value",
  "username": "some_text_with_hex_values"
}
Step 1 Complete!
Requesting an authentication token from the Keystone identity service can be accomplished with a variety of tools, ranging from Postman for Google Chrome to curl. For example, …
$ curl -X POST -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
  "auth": {
    "identity": {
      "methods": ["password"],
      "password": {
        "user": {
          "id": "another_fine_hex_value",
          "password": "not_gonna_tell_you"
        }
      }
    },
    "scope": {
      "project": {
        "id": "project_hex_value"
      }
    }
  }
}' "https://identity.open.softlayer.com/v3/auth/tokens"
This should result in a 500+ Line JSON Response BODY similar to …
Specifically, we want to identify the Swift Object Storage API URL

https://dal.objectstorage.open.softlayer.com/v1/AUTH_some-hex-value

linked to your desired object storage region (dallas, london, …) and associated with a public interface. This will be found within the endpoints section which includes the name "swift". Even more importantly, the HTTP response headers of this /v3/auth/tokens call include an authentication token that we also need to record, since every subsequent authenticated HTTP API call depends on it.
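Picking the right URL out of a 500+ line response by eye is error prone, so here is a small sketch of how the lookup could be scripted. It assumes the standard Keystone v3 token response layout (a token.catalog list whose entries carry name and endpoints, with each endpoint exposing interface, region and url); the function name find_swift_url is hypothetical.

```python
def find_swift_url(token_body: dict, region: str, interface: str = "public"):
    """Return the Swift endpoint URL for a region, or None if no endpoint matches."""
    for service in token_body["token"]["catalog"]:
        if service.get("name") != "swift":
            continue  # skip other services listed in the catalog
        for endpoint in service.get("endpoints", []):
            if endpoint.get("interface") == interface and endpoint.get("region") == region:
                return endpoint["url"]
    return None


# Trimmed, illustrative response body using the placeholder values from this post.
sample_body = {
    "token": {
        "catalog": [
            {
                "name": "swift",
                "type": "object-store",
                "endpoints": [
                    {
                        "interface": "public",
                        "region": "dallas",
                        "url": "https://dal.objectstorage.open.softlayer.com/v1/AUTH_some-hex-value",
                    }
                ],
            }
        ]
    }
}

print(find_swift_url(sample_body, "dallas"))
```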
Here is a sample of the HTTP Response Headers
The X-Subject-Token is the important response header. Its value will be reused within all subsequent HTTP Request Headers using the header X-Auth-Token. Obvious, right?
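For readers who prefer a script to curl, the token exchange might look roughly like the following, using only the Python standard library. The function names are hypothetical and the credential values are the placeholders from the service key; treat it as a sketch rather than a hardened client.

```python
import json
import urllib.request

AUTH_URL = "https://identity.open.softlayer.com/v3/auth/tokens"


def build_token_request(user_id: str, password: str, project_id: str) -> urllib.request.Request:
    """Build the Keystone v3 password-auth POST, scoped to a single project."""
    payload = {
        "auth": {
            "identity": {
                "methods": ["password"],
                "password": {"user": {"id": user_id, "password": password}},
            },
            "scope": {"project": {"id": project_id}},
        }
    }
    return urllib.request.Request(
        AUTH_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_token(request: urllib.request.Request) -> str:
    """Send the request and return the value of the X-Subject-Token response header."""
    with urllib.request.urlopen(request) as resp:
        return resp.headers["X-Subject-Token"]


def auth_headers(token: str) -> dict:
    """Headers to attach to every later Swift API call."""
    return {"X-Auth-Token": token}
```

A typical flow would be auth_headers(fetch_token(build_token_request(user_id, password, project_id))), with the three values taken from the userId, password and projectId fields of the service key.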
Step 2 Complete!
To create a Static Large Object (SLO) manifest, a carefully crafted HTTP PUT request needs to be made to the Swift Object Storage API URL. It includes a valid X-Auth-Token request header, a query string parameter named multipart-manifest with an assigned value of "put", and a body containing a JSON array of objects that together form a single manifest of all segmented files. The body must be valid JSON, so the keys and string values use double quotes:

PUT /v1/AUTH_some-hex-value/name_of_any_existing_container/name_of_file_with_any_extension?multipart-manifest=put HTTP/1.1
Host: dal.objectstorage.open.softlayer.com
Content-Type: text/csv
X-Auth-Token: value-obtained-from-X-Subject-Token-Response-Header
Cache-Control: no-cache

[{"path": "/somecontainer/someprefix-zVliDpHox4", "etag": "2814e177b9371770caf13902d6587373", "size_bytes": 234521937},
 {"path": "/somecontainer/someprefix-5lHhJcyjEX", "etag": "843fbdfb493b484b035436e0bb782560", "size_bytes": 241395892},
 {"path": "/somecontainer/someprefix-Q7xSsBprGK", "etag": "05d09e28c8994cf5f9833c9dee6494a7", "size_bytes": 237095501},
 {"path": "/somecontainer/someprefix-8pQIF4w1GR", "etag": "e0d912fc4b88961c33ecfe70e64a7855", "size_bytes": 226289048},
 ...]
or via curl …
If all goes well, an HTTP Response Code of 201 should be returned. To validate, you can open your IBM Bluemix Object Storage Service dashboard and observe creation of the “name_of_file_with_any_extension” manifest file within the name_of_any_existing_container. It should show an aggregated size which matches the sum of all segmented files. This new manifest file can now be singularly referenced and represents a collection of the 61 individual segment files. For example, within a Jupyter notebook we loaded the data using syntax similar to “swift://name_of_any_existing_container.spark/name_of_file_with_any_extension”. Sweet!
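If the manifest you were handed looks like the sample near the top of this post (Python-style dicts with straight single quotes, one per line, each ending in a comma), it is not yet valid JSON. A minimal sketch of a converter follows; it assumes the text is complete (no "..." elisions), and the function names are hypothetical.

```python
import ast
import json

REQUIRED_KEYS = {"path", "etag", "size_bytes"}


def parse_segments(manifest_text: str) -> list:
    """Parse single-quoted, comma-separated segment dicts into a list of dicts."""
    text = manifest_text.strip().rstrip(",")
    if not text.startswith("["):
        text = "[" + text + "]"  # wrap bare lines so they form one list literal
    segments = ast.literal_eval(text)  # safe parsing of Python literals only
    for segment in segments:
        missing = REQUIRED_KEYS - segment.keys()
        if missing:
            raise ValueError(f"segment {segment!r} is missing {sorted(missing)}")
    return segments


def to_slo_body(segments: list) -> str:
    """Serialize the segments as the JSON body for the multipart-manifest=put request."""
    return json.dumps(segments)


def total_size(segments: list) -> int:
    """Sum of segment sizes; the dashboard should report this for the manifest."""
    return sum(segment["size_bytes"] for segment in segments)


sample_text = """
{'path': '/somecontainer/someprefix-NjT2OURYBq', 'etag': 'ebc7d0d4718d8513fd5cdcf76de66f2a', 'size_bytes': 234003629},
{'path': '/somecontainer/someprefix-zVliDpHox4', 'etag': '2814e177b9371770caf13902d6587373', 'size_bytes': 234521937},
"""
segments = parse_segments(sample_text)
print(to_slo_body(segments))
print(total_size(segments))
```

Comparing total_size against the aggregated size shown in the dashboard is a quick sanity check that no segment was dropped from the manifest.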
To create a Dynamic Large Object (DLO) manifest instead, a carefully crafted HTTP PUT request needs to be made to the Swift Object Storage API URL. It includes a valid X-Auth-Token request header, a required request header named X-Object-Manifest, and an optional Content-Length request header with a value of 0:

PUT /v1/AUTH_some-hex-value/name_of_any_existing_container/name_of_file_with_any_extension HTTP/1.1
Host: dal.objectstorage.open.softlayer.com
Content-Type: application/json
X-Auth-Token: value-obtained-from-X-Subject-Token-Response-Header
Content-Length: 0
X-Object-Manifest: name_of_container_which_holds_the_segmented_files/common_prefix_label_to_match_against_for_all_segmented_files
Cache-Control: no-cache
or via curl …
If all goes well, an HTTP Response Code of 201 should be returned. To validate, this new zero-byte sized manifest file can now be singularly referenced and represents a collection of the 61 individual segment files. For example, within a Jupyter notebook we loaded the data using syntax similar to “swift://name_of_any_existing_container.spark/name_of_file_with_any_extension”. What’s really cool about this approach is that in the future we could choose to upload a 62nd segment file into the same container area and if we follow the common prefix label provided earlier within the X-Object-Manifest header – then our manifest will magically auto-include the new data with no additional editing of the manifest itself. Dynamic indeed!
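To make the prefix matching concrete, here is a small local simulation of which objects a given X-Object-Manifest value would pull in. It is a sketch based on our understanding that Swift concatenates matching objects in container listing order, which is sorted by name; the function names and object names are hypothetical.

```python
def split_manifest_header(value: str):
    """Split an X-Object-Manifest value into (container, prefix)."""
    container, _, prefix = value.partition("/")
    return container, prefix


def matching_segments(object_names, prefix: str) -> list:
    """Objects the manifest would serve, in the name-sorted order they are concatenated."""
    return sorted(name for name in object_names if name.startswith(prefix))


container, prefix = split_manifest_header("somecontainer/someprefix-")
listing = ["someprefix-0002", "unrelated.csv", "someprefix-0001", "someprefix-0062"]
print(container, matching_segments(listing, prefix))
```

Because the order is by name, zero-padded suffixes (0001, 0002, …) are the safer choice when segment order matters; random suffixes like the ones in our third-party upload give an arbitrary concatenation order.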
Mission accomplished!
Supporting Resources: Creating a special manifest to represent many segmented objects needn't be hard within IBM Bluemix Object Storage. As we've seen, this provides the significant advantage of dealing with data that is larger than 5 GB in size, which is often the case for larger data workloads. However, keep in mind that manifest files can be created for segmented data files aggregating to any size. We've explored the pros and cons of creating Static or Dynamic Large Objects and shown the HTTP REST API mechanics to achieve either. Our team has created a utility to help with segmentation of large files into specified chunk sizes while avoiding mid-line splits. We recommend reading the IBM Bluemix Object Storage documentation. We also encourage readers to learn about the features of OpenStack Swift, and more specifically its large object support.
Easy Button: At this point, you may be wondering if there is a way to obtain an SLO manifest containing all of the segmented ETag and size values in JSON format, or if the process is easier when the large file is available to you locally rather than in our odd situation. The answer is an emphatic YES. The Python OpenStack Swift Client generally provides automatic manifest creation when uploading a single large file, as illustrated below.
Example: A locally stored large file needs to be uploaded to Object Storage.
Approach: Use the Python Swift Client upload feature with appropriate arguments.
SLO:
$ swift --os-auth-url=https://identity.open.softlayer.com/v3 --os-user-id=some_hex_value --os-password="weird_characters" --os-project-id=another_hex_value --os-region-name=dallas -V 3 upload my_object_storage_container_name -S int_seg_size_in_bytes my_local_large_file_with_some_extension --use-slo
my_local_large_file_with_some_extension segment 3
my_local_large_file_with_some_extension segment 1
my_local_large_file_with_some_extension segment 2
my_local_large_file_with_some_extension segment 0
my_local_large_file_with_some_extension/1443450560.000000/160872806/52428800/00000002
my_local_large_file_with_some_extension/1443450560.000000/160872806/52428800/00000003
my_local_large_file_with_some_extension/1443450560.000000/160872806/52428800/00000001
my_local_large_file_with_some_extension/1443450560.000000/160872806/52428800/00000000
my_local_large_file_with_some_extension
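The segment object names in that output appear to follow the pattern name/mtime/total_size/segment_size/index. That reading is inferred from the sample output alone, not taken from the client's documentation, so treat the following sketch (with its hypothetical function name) as a way to predict how many segments an upload should produce rather than as a specification.

```python
import math


def expected_segment_names(object_name: str, mtime: float, total_size: int, segment_size: int) -> list:
    """Predict segment object names, assuming the pattern seen in the sample output."""
    count = math.ceil(total_size / segment_size)  # the last segment may be partial
    return [
        f"{object_name}/{mtime:.6f}/{total_size}/{segment_size}/{index:08d}"
        for index in range(count)
    ]


names = expected_segment_names(
    "my_local_large_file_with_some_extension", 1443450560.0, 160872806, 52428800
)
for name in names:
    print(name)
```

With the numbers from the sample (a 160872806-byte file and 52428800-byte segments) this yields four names, indexed 00000000 through 00000003, matching the four segments reported by the upload.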
Two (2) things happen. A new container named my_object_storage_container_name_segments is created to hold the segmented files, and a new manifest file named my_local_large_file_with_some_extension is generated. As discussed earlier, this manifest should show the aggregated size of all segments that it represents. If you'd like to grab a copy of this SLO manifest for additional hacking, version control or inspection … you'll need to obtain a valid X-Auth-Token (described above) and issue an HTTP GET request with the multipart-manifest query-string parameter set to get:
curl -X GET -H "Content-Type: text/csv" -H "X-Auth-Token: value-obtained-from-X-Subject-Token-Response-Header" -H "Cache-Control: no-cache" "https://dal.objectstorage.open.softlayer.com/v1/AUTH_some-hex-value/name_of_any_existing_container/name_of_file_with_any_extension?multipart-manifest=get"
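One thing to watch for: to the best of our understanding, the listing returned by multipart-manifest=get describes each segment with the keys name, hash and bytes, not the path, etag and size_bytes keys expected by multipart-manifest=put. Assuming that shape, a downloaded manifest could be converted back into an uploadable one with a sketch like this (the function name and sample entry are illustrative):

```python
import json


def listing_to_put_manifest(listing: list) -> list:
    """Map GET-style segment entries (name/hash/bytes) to PUT-style ones (path/etag/size_bytes)."""
    return [
        {"path": entry["name"], "etag": entry["hash"], "size_bytes": entry["bytes"]}
        for entry in listing
    ]


# Illustrative entry shaped like a multipart-manifest=get response item.
downloaded = [
    {
        "name": "/somecontainer/someprefix-zVliDpHox4",
        "hash": "2814e177b9371770caf13902d6587373",
        "bytes": 234521937,
        "content_type": "application/octet-stream",
    }
]
print(json.dumps(listing_to_put_manifest(downloaded)))
```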
DLO:
$ swift --os-auth-url=https://identity.open.softlayer.com/v3 --os-user-id=some_hex_value --os-password="weird_characters" --os-project-id=another_hex_value --os-region-name=dallas -V 3 upload my_object_storage_container_name -S int_seg_size_in_bytes my_local_large_file_with_some_extension
my_local_large_file_with_some_extension segment 3
my_local_large_file_with_some_extension segment 1
my_local_large_file_with_some_extension segment 2
my_local_large_file_with_some_extension segment 0
Two (2) things happen. A new container named my_object_storage_container_name_segments is created to hold the segmented files and a new manifest file named my_local_large_file_with_some_extension is generated. As discussed earlier, this manifest is a zero-byte sized file and represents ALL files located within a single container that follow a described naming prefix convention.
Food for Thought
In conclusion, whether you need a 100% representation using the Python OpenStack Swift Client upload feature or a partial representation via the OpenStack Storage APIs to facilitate large data analysis and more efficient notebook designs with faster processing times, you’ll be able to access the right size of data for your task.
Early in my career, I specialized in melting plastic and debating with ISO auditors. Later, I tested software test tools: envision a person measuring rulers in a ruler factory. After a promotion, I managed a team great at breaking software. I was also the test organization's performance expert, assessing application throughput/speed and recommending fixes to make applications go faster. Later on, I worked on gluing non-IBM and IBM software together and showing customers how easy it was to do. As a facilitator to support the CEO's office, I organized studies for our executive leadership by gathering people and steering chats to look at disruptive technologies and see where new money could be made. I'm currently a member of the amazing IBM jStart team. We explore the "art of the possible", have an aversion for saying "it can't be done" and love learning through direct client engagement. My general focus has been on cloud-related emerging technologies facilitated by our Cloud Foundry based Platform as a Service (PaaS), IBM Bluemix™. Within that framework, my current technology adventure is with Apache Spark, lightning fast cluster computing, for Big Data analytics. I've travelled the world and enjoy experiencing new ideas. Curiosity keeps me creating and consuming. "If it can be, I will try" - Me
Reposted from: http://dhqwd.baihongyu.com/