ó ôBQc@sddlZddlZddlZddlmZddlmZddlmZddl m Z ddlm Z de fd„ƒYZ d e fd „ƒYZdeeed „Zd e fd „ƒYZde fd„ƒYZde fd„ƒYZde fd„ƒYZdS(iÿÿÿÿN(tBucketListingRef(tCommandException(tPluralityCheckableIterator(tStorageUriBuilder(tContainsWildcardtNameExpansionResultcBseeZdZd ed„Zd„Zd„Zd„Zd„Z d„Z d„Z d„Z d „Z RS( s- Holds one fully expanded result from iterating over NameExpansionIterator. The member data in this class need to be pickleable because NameExpansionResult instances are passed through Multiprocessing.Queue. In particular, don't include any boto state like StorageUri, since that pulls in a big tree of objects, some of which aren't pickleable (and even if they were, pickling/unpickling such a large object tree would result in significant overhead). The state held in this object is needed for handling the various naming cases (e.g., copying from a single source URI to a directory generates different dest URI names than copying multiple URIs to a directory, to be consistent with naming rules used by the Unix cp command). For more details see comments in _NameExpansionIterator. cCsC||_||_||_||_||_||_||_dS(s] Args: src_uri_str: string representation of StorageUri that was expanded. is_multi_src_request: bool indicator whether src_uri_str expanded to more than 1 BucketListingRef. src_uri_expands_to_multi: bool indicator whether the current src_uri expanded to more than 1 BucketListingRef. names_container: Bool indicator whether src_uri names a container. expanded_uri_str: string representation of StorageUri to which src_uri_str expands. have_existing_dst_container: bool indicator whether this is a copy request to an existing bucket, bucket subdir, or directory. Default None value should be used in cases where this is not needed (commands other than cp). is_latest: Bool indicating that the result represents the object's current version. N(t src_uri_strtis_multi_src_requesttsrc_uri_expands_to_multitnames_containertexpanded_uri_strthave_existing_dst_containert is_latest(tselfRRRR R R R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt__init__3s      cCs d|jS(Ns%s(R (R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt__repr__OscCs |jdkS(s2Returns True if name expansion yielded no matches.N(t expanded_blrtNone(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytIsEmptyRscCs|jS(sFReturns the string representation of the StorageUri that was expanded.(R(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt GetSrcUriStrVscCs|jS(se Returns bool indicator whether name expansion resulted in more than 0 BucketListingRef. (R(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytIsMultiSrcRequestZscCs|jS(si Returns bool indicator whether the current src_uri expanded to more than 1 BucketListingRef (R(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytSrcUriExpandsToMultiascCs|jS(sd Returns bool indicator of whether src_uri names a directory, bucket, or bucket subdir. (R (R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytNamesContainerhscCs|jS(s[ Returns the string representation of StorageUri to which src_uri_str expands. (R (R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytGetExpandedUriStroscCs|jS(sReturns bool indicator whether this is a copy request to an existing bucket, bucket subdir, or directory, or None if not relevant.(R (R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytHaveExistingDstContainervsN(t__name__t __module__t__doc__RtFalseRRRRRRRRR(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR!s       t_NameExpansionIteratorcBs5eZdZdeeed„Zd„Zd„ZRS(sö Iterates over all src_uris, expanding wildcards, object-less bucket names, subdir bucket names, and directory names, generating a flat listing of all the matching objects/files. You should instantiate this object using the static factory function NameExpansionIterator, because consumers of this iterator need the PluralityCheckableIterator wrapper built by that function. Yields: gslib.name_expansion.NameExpansionResult. Raises: CommandException: if errors encountered. c Cs‡||_||_||_||_||_t||ƒ|_||_||_||_ | |_ | |_ idt 6dt 6|_dS(sA Args: command_name: name of command being run. proj_id_handler: ProjectIdHandler to use for current command. headers: Dictionary containing optional HTTP headers to pass to boto. debug: Debug level to pass in to boto connection (range 0..3). bucket_storage_uri_class: Class to instantiate for cloud StorageUris. Settable for testing/mocking. uri_strs: PluralityCheckableIterator of URI strings needing expansion. recursion_requested: True if -R specified on command-line. have_existing_dst_container: Bool indicator whether this is a copy request to an existing bucket, bucket subdir, or directory. Default None value should be used in cases where this is not needed (commands other than cp). flat: Bool indicating whether bucket listings should be flattened, i.e., so the mapped-to results contain objects spanning subdirectories. all_versions: Bool indicating whether to iterate over all object versions. for_all_version_delete: Bool indicating whether this is for an all-version delete. Examples of _NameExpansionIterator with flat=True: - Calling with one of the uri_strs being 'gs://bucket' will enumerate all top-level objects, as will 'gs://bucket/' and 'gs://bucket/*'. - 'gs://bucket/**' will enumerate all objects in the bucket. - 'gs://bucket/abc' will enumerate all next-level objects under directory abc (i.e., not including subdirectories of abc) if gs://bucket/abc/* matches any objects; otherwise it will enumerate the single name gs://bucket/abc - 'gs://bucket/abc/**' will enumerate all objects under abc or any of its subdirectories. - 'file:///tmp' will enumerate all files under /tmp, as will 'file:///tmp/*' - 'file:///tmp/**' will enumerate all files under /tmp or any of its subdirectories. Example if flat=False: calling with gs://bucket/abc/* lists matching objects or subdirs, but not sub-subdirs or objects beneath subdirs. Note: In step-by-step comments below we give examples assuming there's a gs://bucket with object paths: abcd/o1.txt abcd/o2.txt xyz/o1.txt xyz/o2.txt and a directory file://dir with file paths: dir/a.txt dir/b.txt dir/c/ s**t*N(t command_nametproj_id_handlertheaderstdebugtbucket_storage_uri_classRt suri_builderturi_strstrecursion_requestedR tflatt all_versionstTrueRt_flatness_wildcard( R RR R!R"R#R%R&R R'R(tfor_all_version_delete((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRŽs5          ccs­x¦|jD]›}t|ƒr.|j|ƒ}n'|jj|ƒ}tt|ƒgƒ}t|ƒ}|jr‹|j r‹t |||jƒ}n0|j r¯t ||d|j ƒ}n t|ƒ}t|ƒ}g}|j|j}|jƒpò|jƒ}|jjƒp|}|jƒr)td|ƒ‚nxy|D]q\} } | jƒjƒ rš|jse| jƒ ršt|||| | jƒ|jd| jƒƒVq0n|j så| jƒjƒr¾d} nd} d| | jƒ|jfGHq0n| jƒjƒrd| jƒ|f} n| jƒj|ƒ} t|j| ƒƒ} |pI| jƒ}|jjƒp^|}x=| D]5} t|||t| jƒ|jd| jƒƒVqhWq0Wq WdS(NR!sNo URIs matched: %sR t directorytbuckets-Omitting %s "%s". (Did you mean to do %s -R?)s%s/%s(R%Rt_WildcardIteratorR$t StorageUrititerRRR'R&t_ImplicitBucketSubdirIteratorR(t_AllVersionIteratorR!t_NonContainerTuplifyIteratorR*t has_pluralitytis_emptyRtGetUriR t HasPrefixRt GetUriStringR tIsLatestt is_file_uriRtclone_replace_nameR)(R turi_strtpost_step1_itertsuritpost_step2_itertexp_src_bucket_listing_refstwcRRR tblrtdescturi_to_iteratetwc_iter((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt__iter__Ósj                   c Cs7tj||jd|jd|jd|jd|jƒS(s  Helper to instantiate gslib.WildcardIterator. Args are same as gslib.WildcardIterator interface, but this method fills in most of the values from instance state. Args: uri_or_str: StorageUri or URI string naming wildcard objects to iterate. R#R!R"R((twildcard_iteratorR R#R!R"R((R t uri_or_str((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR.s   N( RRRRR)RRRFR.(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR}s B Lc Csgt|ƒ}t|||||||||d| d| ƒ } t| ƒ} | jƒrctdƒ‚n| S(sÙ Static factory function for instantiating _NameExpansionIterator, which wraps the resulting iterator in a PluralityCheckableIterator and checks that it is non-empty. Also, allows uri_strs can be either an array or an iterator. Args: command_name: name of command being run. proj_id_handler: ProjectIdHandler to use for current command. headers: Dictionary containing optional HTTP headers to pass to boto. debug: Debug level to pass in to boto connection (range 0..3). bucket_storage_uri_class: Class to instantiate for cloud StorageUris. Settable for testing/mocking. uri_strs: PluralityCheckableIterator of URI strings needing expansion. recursion_requested: True if -R specified on command-line. have_existing_dst_container: Bool indicator whether this is a copy request to an existing bucket, bucket subdir, or directory. Default None value should be used in cases where this is not needed (commands other than cp). flat: Bool indicating whether bucket listings should be flattened, i.e., so the mapped-to results contain objects spanning subdirectories. all_versions: Bool indicating whether to iterate over all object versions. for_all_version_delete: Bool indicating whether this is for an all-version delete. Examples of ExpandWildcardsAndContainers with flat=True: - Calling with one of the uri_strs being 'gs://bucket' will enumerate all top-level objects, as will 'gs://bucket/' and 'gs://bucket/*'. - 'gs://bucket/**' will enumerate all objects in the bucket. - 'gs://bucket/abc' will enumerate all next-level objects under directory abc (i.e., not including subdirectories of abc) if gs://bucket/abc/* matches any objects; otherwise it will enumerate the single name gs://bucket/abc - 'gs://bucket/abc/**' will enumerate all objects under abc or any of its subdirectories. - 'file:///tmp' will enumerate all files under /tmp, as will 'file:///tmp/*' - 'file:///tmp/**' will enumerate all files under /tmp or any of its subdirectories. Example if flat=False: calling with gs://bucket/abc/* lists matching objects or subdirs, but not sub-subdirs or objects beneath subdirs. Note: In step-by-step comments below we give examples assuming there's a gs://bucket with object paths: abcd/o1.txt abcd/o2.txt xyz/o1.txt xyz/o2.txt and a directory file://dir with file paths: dir/a.txt dir/b.txt dir/c/ R(R+sNo URIs matched(RRR5R( RR R!R"R#R%R&R R'R(R+tname_expansion_iterator((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytNameExpansionIterator/s<   tNameExpansionIteratorQueuecBs‰eZdZd„Zd„Zd„Zd„Zd d d d„Zd„Z d d d„Z d„Z d „Z d „Z d „Zd „ZRS(s! Wrapper around NameExpansionIterator that provides a Multiprocessing.Queue facade. Only a blocking get() function can be called, and the block and timeout params on that function are ignored. All other class functions raise NotImplementedError. This class is thread safe. cCs%||_||_tjƒ|_dS(N(RIt final_valuet threadingtLocktlock(R RIRL((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR‚s  cCstdƒ‚dS(Ns2NameExpansionIteratorQueue.qsize() not implemented(tNotImplementedError(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytqsize‡scCstdƒ‚dS(Ns2NameExpansionIteratorQueue.empty() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytempty‹scCstdƒ‚dS(Ns1NameExpansionIteratorQueue.full() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytfullscCstdƒ‚dS(Ns0NameExpansionIteratorQueue.put() not implemented(RP(R tobjtblockttimeout((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytput“scCstdƒ‚dS(Ns7NameExpansionIteratorQueue.put_nowait() not implemented(RP(R RT((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt put_nowait—scCsI|jjƒz'|jjƒr&|jS|jjƒSWd|jjƒXdS(N(ROtacquireRIR5RLtnexttrelease(R RURV((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytget›s  cCstdƒ‚dS(Ns7NameExpansionIteratorQueue.get_nowait() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt get_nowait¤scCstdƒ‚dS(Ns8NameExpansionIteratorQueue.get_no_wait() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt get_no_wait¨scCstdƒ‚dS(Ns2NameExpansionIteratorQueue.close() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytclose¬scCstdƒ‚dS(Ns8NameExpansionIteratorQueue.join_thread() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyt join_thread°scCstdƒ‚dS(Ns?NameExpansionIteratorQueue.cancel_join_thread() not implemented(RP(R ((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pytcancel_join_thread´sN(RRRRRQRRRSRRWRXR\R]R^R_R`Ra(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRKvs          R3cBs eZdZd„Zd„ZRS(s¼ Iterator that produces the tuple (False, blr) for each iteration of blr_iter. Used for cases where blr_iter iterates over a set of BucketListingRefs known not to name containers. cCs ||_dS(s= Args: blr_iter: iterator of BucketListingRef. N(tblr_iter(R Rb((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRÀsccs#x|jD]}t|fVq WdS(N(RbR(R RB((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRFÇs(RRRRRF(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR3¹s R1cBs eZdZd„Zd„ZRS(s2 Iterator wrapper that iterates over blr_iter, performing implicit bucket subdir expansion. Each iteration yields tuple (names_container, expanded BucketListingRefs) where names_container is true if URI names a directory, bucket, or bucket subdir (vs how StorageUri.names_container() doesn't handle latter case). For example, iterating over [BucketListingRef("gs://abc")] would expand to: [BucketListingRef("gs://abc/o1"), BucketListingRef("gs://abc/o2")] if those subdir objects exist, and [BucketListingRef("gs://abc") otherwise. cCs||_||_||_dS(s  Args: name_expansion_instance: calling instance of NameExpansion class. blr_iter: iterator of BucketListingRef. flat: bool indicating whether bucket listings should be flattened, i.e., so the mapped-to results contain objects spanning subdirectories. N(Rbtname_expansion_instanceR'(R RcRbR'((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRÜs  ccs¾x·|jD]¬}|jƒ}|jƒr«t|jj|jjjd|jj dƒ|jj |j fƒƒƒ}|j ƒsx'|D]}t |fVq…Wq¶t|fVq t|fVq WdS(Ns%s/%st/(RbR6t names_objectRRcR.R$R/turitrstripR*R'R5R)R(R RBRftimplicit_subdir_iteratortexp_blr((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRFès       (RRRRRF(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR1Ìs R2cBs#eZdZdd„Zd„ZRS(sO Iterator wrapper that iterates over blr_iter, performing implicit version expansion. Output behavior is identical to that in _ImplicitBucketSubdirIterator above. For example, iterating over [BucketListingRef("gs://abc/o1")] would expand to: [BucketListingRef("gs://abc/o1#1234"), BucketListingRef("gs://abc/o1#1235")] cCs||_||_||_dS(s  Args: name_expansion_instance: calling instance of NameExpansion class. blr_iter: iterator of BucketListingRef. flat: bool indicating whether bucket listings should be flattened, i.e., so the mapped-to results contain objects spanning subdirectories. N(RbRcR!(R RcRbR!((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRs  ccsÐt}xÃ|jD]¸}|jƒ}|jƒsCt}t|fVPnxn|jd|jd|jdtƒD]H}|j|jkr„Pnt |j |ƒd|ƒ}t}t|fVqhW|rt|fVqqWdS(NtprefixR!R(tkey( R)RbR6ReRt list_buckett object_nameR!tnameRtclone_replace_key(R RRRBRfRkt version_blr((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyRFs     N(RRRRRRF(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyR2ûs  (tcopyRMRGtbucket_listing_refRtgslib.exceptionRt"gslib.plurality_checkable_iteratorRtgslib.storage_uri_builderRRtobjectRRRR)RRJRKR3R1R2(((s2/tmp/tmp.yUYbTOKr8o/gsutil/gslib/name_expansion.pyts    \µ BC/